Abstract: A lower bound on the minimum mean-squared 
error (MSE) in a Bayesian estimation problem is proposed in 
this paper. This bound utilizes a well-known connection to the 
deterministic estimation setting. Using the prior distribution, the 
bias function which minimizes the Cramer-Rao bound can be 
determined, resulting in a lower bound on the Bayesian MSE. 
The bound is developed for the general case of a vector parameter 
with an arbitrary probability distribution, and is shown to be 
asymptotically tight in both the high and low signal-to-noise ratio 
regimes. A numerical study demonstrates several cases in which 
the proposed technique is both simpler to compute and tighter 
than alternative methods. 

Index Terms — Bayesian bounds, Bayesian estimation, mini- 
mum mean-squared error estimation, optimal bias, performance 
bounds. 



I. Introduction 

The goal of estimation theory is to infer the value of 
an unknown parameter based on observations. A common 
approach to this problem is the Bayesian framework, in which 
the estimate is constructed by combining the measurements 
with prior information about the parameter [1]. In this setting, 
the parameter $\theta$ is random, and its distribution describes 
the a priori knowledge of the unknown value. In addition, 
measurements $x$ are obtained, whose conditional distribution, 
given $\theta$, provides further information about the parameter. The 
objective is to construct an estimator $\hat{\theta}$, which is a function 
of the measurements, so that $\hat{\theta}$ is close to $\theta$ in some sense. A 
common measure of the quality of an estimator is its mean-squared 
error (MSE), given by $E\{\|\hat{\theta} - \theta\|^2\}$. 

It is well-known that the posterior mean $E\{\theta | x\}$ is the 
technique minimizing the MSE. Thus, from a theoretical 
perspective, there is no difficulty in finding the minimum 
MSE (MMSE) estimator in any given problem. In practice, 
however, the complexity of computing the posterior mean 
is often prohibitive. As a result, various alternatives, such 
as the maximum a posteriori (MAP) technique, have been 
developed [2]. The purpose of such methods is to approach the 
performance of the MMSE estimator with a computationally 
efficient algorithm. 

An important goal is to quantify the performance degra- 
dation resulting from the use of these suboptimal techniques. 
One way to do this is to compare the MSE of the method 
used in practice with the MMSE. Unfortunately, computation 
of the MMSE is itself infeasible in many cases. This has led 
to a large body of work seeking to find simple lower bounds 
on the MMSE in various estimation problems [3]-[12]. 

Generally speaking, previous bounds can be divided into 
two categories. The Weiss-Weinstein family is based on a 
covariance inequality and includes the Bayesian Cramer-Rao 
bound [3], the Bobrovski-Zakai bound [8], and the Weiss- 
Weinstein bound [9], [10]. The Ziv-Zakai family of bounds 
is based on comparing the estimation problem to a related 
detection scenario. This family includes the Ziv-Zakai bound 
[4] and its improvements, notably the Bellini-Tartara bound 
[6], the Chazan-Zakai-Ziv bound [7], and the generalization 
of Bell et al. [11]. Recently, Renaux et al. have combined both 
approaches [12]. 

The accuracy of the bounds described above is usually 
tested numerically in particular estimation settings. Few of 
the previous results provide any sort of analytical proof 
of accuracy, even under asymptotic conditions. Bellini and 
Tartara [6] briefly discuss performance of their bound at high 
signal-to-noise ratio (SNR), and Bell et al. [11] prove that their 
bound converges to the true value at low SNR for a particular 
family of Gaussian-like probability distributions. To the best 
of our knowledge, there are no other results concerning the 
asymptotic performance of Bayesian bounds. 

A different estimation setting arises when one considers $\theta$ 
as a deterministic unknown parameter. In this case, too, 
a common goal is to construct an estimator having low MSE. 
However, the term MSE has a very different meaning in the 
deterministic setting, since in this case, the expectation is taken 
only over the random variable x. One elementary difference 
with far-reaching implications is that in the Bayesian case, the 
MSE is a single real number, whereas the deterministic MSE 
is a function of the unknown parameter [13]-[15]. 

Many lower bounds have been developed for the determin- 
istic setting, as well. These include classical results such as 
the Cramer-Rao [16], [17], Hammersley-Chapman-Robbins 
[18], [19], Bhattacharya [20], and Barankin [21] bounds, as 
well as more recent results [22]-[27]. By far the simplest and 
most commonly used of these approaches is the Cramer-Rao 
bound (CRB). Like most other deterministic bounds, the CRB 
deals explicitly with unbiased estimators, or, equivalently, 
with estimators having a specific, pre-specified bias function. 
Two exceptions are the uniform CRB [23], [25] and the 
minimax linear-bias bound [26], [27]. The CRB is known to 
be asymptotically tight in many cases, even though many later 
bounds are sharper than it [14], [25], [28]. 

Although the deterministic and Bayesian settings stem from 
different points of view, there exist insightful relations between 
the two approaches. The basis for this connection is the fact 
that by adding a prior distribution for $\theta$, any deterministic 



problem can be transformed to a corresponding Bayesian set- 
ting. Several theorems relate the performance of corresponding 
Bayesian and deterministic scenarios [13]. As a consequence, 
numerous bounds have both a deterministic and a Bayesian 
version [3], [10], [12], [29]. 

The simplicity and asymptotic tightness of the deterministic 
CRB motivate its use in problems in which $\theta$ is random. 
Such an application was described by Young and Westerberg 
[5], who considered the case of a scalar $\theta$ constrained to 
the interval $[\theta_0, \theta_1]$. They used the prior distribution of $\theta$ to 
determine the optimal bias function for use in the biased CRB, 
and thus obtained a Bayesian bound. It should be noted that 
this result differs from the Bayesian CRB of Van Trees [3]; 
the two bounds are compared in Section II-C. We refer to 
the result of Young and Westerberg as the optimal-bias bound 
(OBB), since it is based on choosing the bias function which 
optimizes the CRB using the given prior distribution. 

This paper provides an extension and a deeper analysis 
of the OBB. Specifically, we generalize the bound to an 
arbitrary n-dimensional estimation setting [30]. The bound 
is determined by finding the solution to a certain partial 
differential equation. Using tools from functional analysis, we 
demonstrate that a unique solution exists for this differential 
equation. Under suitable symmetry conditions, it is shown that 
the method can be reduced to the solution of an ordinary 
differential equation and, in some cases, presented in closed 
form. 

The mathematical tools employed in this paper are also used 
for characterizing the performance of the OBB. Specifically, it 
is demonstrated analytically that the proposed bound is asymp- 
totically tight for both high and low SNR values. Furthermore, 
the OBB is compared with several other bounds; in the 
examples considered, the OBB is both simpler computationally 
and more accurate than all relevant alternatives. 

The remainder of this paper is organized as follows. In Section II, 
we derive the OBB for a vector parameter. Section III 
discusses some mathematical concepts required to ensure the 
existence of the OBB. In Section IV, a practical technique for 
calculating the bound is developed using variational calculus. 
In Section V, we demonstrate some properties of the OBB, 
including its asymptotic tightness. Finally, in Section VI, we 
compare the performance of the bound with that of other 
relevant techniques. 

II. The Optimal-Bias Bound 

In this section, we derive the OBB for the general vector 
case. To this end, we first examine the relation between the 
Bayesian and deterministic estimation settings (Section II-A). 
Next, we focus on the deterministic case and review the basic 
properties of the CRB (Section II-B). Finally, the OBB is 
derived from the CRB (Section II-C). 

The focus of this paper is the Bayesian estimation problem, 
but the bound we propose stems from the theory of 
deterministic estimation. To avoid confusion, we will indicate 
that a particular quantity refers to the deterministic setting 
by appending the symbol "$;\theta$" to it. For example, the notation 
$E\{\cdot\}$ denotes expectation over both $\theta$ and $x$, i.e., expectation 
in the Bayesian sense, while expectation solely over $x$ (in 
the deterministic setting) is denoted by $E\{\cdot\,; \theta\}$. The notation 
$E\{\cdot \,|\, \theta\}$ indicates Bayesian expectation conditioned on $\theta$. 

Some further notation used throughout the paper is as follows. 
Lowercase boldface letters signify vectors and uppercase 
boldface letters indicate matrices. The $i$th component of a 
vector $v$ is denoted $v_i$, while $v^{(1)}, v^{(2)}, \ldots$ signifies a sequence 
of vectors. The derivative $\partial f/\partial v$ of a function $f(v)$ is a 
vector function whose $i$th element is $\partial f/\partial v_i$. Similarly, given 
a vector function $b(\theta)$, the derivative $\partial b/\partial \theta$ is defined as the 
matrix function whose $(i,j)$th entry is $\partial b_i/\partial \theta_j$. The squared 
Euclidean norm $v^T v$ of a vector $v$ is denoted $\|v\|^2$, while 
the squared Frobenius norm $\mathrm{Tr}(M^T M)$ of a matrix $M$ is 
denoted $\|M\|_F^2$. In Section III, we will also define some 
functional norms, which will be of use later in the paper. 

A. The Bayesian-Deterministic Connection 

We now review a fundamental relation between the Bayesian 
and deterministic estimation settings. Let $\theta$ be an unknown 
random vector in $\mathbb{R}^n$ and let $x$ be a measurement vector. 
The joint probability density function (pdf) of $\theta$ and $x$ is 
$p_{x,\theta}(x,\theta) = p_{x|\theta}(x|\theta)\, p_\theta(\theta)$, where $p_\theta$ is the prior 
distribution of $\theta$ and $p_{x|\theta}$ is the conditional distribution of $x$ given 
$\theta$. For later use, define the set $\Theta$ of feasible parameter values 
by 

$$\Theta = \{\theta \in \mathbb{R}^n : p_\theta(\theta) > 0\}. \tag{1}$$

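Suppose $\hat{\theta} = \hat{\theta}(x)$ is an estimator of $\theta$. Its (Bayesian) MSE 
is given by 

$$\mathrm{MSE} = E\{\|\hat{\theta} - \theta\|^2\} = \iint \|\hat{\theta} - \theta\|^2\, p_{x,\theta}(x,\theta)\, dx\, d\theta. \tag{2}$$

By the law of total expectation, we have 

$$\mathrm{MSE} = \int \left( \int \|\hat{\theta} - \theta\|^2\, p_{x|\theta}(x|\theta)\, dx \right) p_\theta(\theta)\, d\theta = E\left\{ E\{\|\hat{\theta} - \theta\|^2 \,|\, \theta\} \right\}. \tag{3}$$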

Now consider a deterministic estimation setting, i.e., suppose 
$\theta$ is a deterministic unknown which is to be estimated 
from random measurements $x$. Let the distribution $p_{x;\theta}$ of $x$ 
(as a function of $\theta$) be given by $p_{x;\theta}(x;\theta) = p_{x|\theta}(x|\theta)$, 
i.e., the distribution of $x$ in the deterministic case equals 
the conditional distribution in the corresponding Bayesian 
problem. 

The estimator $\hat{\theta}$ defined above is simply a function of the 
measurements, and can therefore be applied in the deterministic 
case as well. Its deterministic MSE is given by 

$$E\{\|\hat{\theta} - \theta\|^2 ; \theta\} = \int \|\hat{\theta} - \theta\|^2\, p_{x;\theta}(x;\theta)\, dx. \tag{4}$$

Since $p_{x;\theta}(x;\theta) = p_{x|\theta}(x|\theta)$, we have 

$$E\{\|\hat{\theta} - \theta\|^2 ; \theta\} = E\{\|\hat{\theta} - \theta\|^2 \,|\, \theta\}. \tag{5}$$

Combining this fact with (3), we find that the Bayesian 
MSE equals the expectation of the MSE of the corresponding 
deterministic problem, i.e., 

$$E\{\|\hat{\theta} - \theta\|^2\} = E\left\{ E\{\|\hat{\theta} - \theta\|^2 ; \theta\} \right\}. \tag{6}$$

This relation will be used to construct the OBB in Section II-C. 
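
As a sanity check on (6), the following minimal sketch evaluates both sides for a hypothetical scalar example that is not taken from the paper: $\theta \sim N(0, \sigma_\theta^2)$, $x = \theta + w$ with $w \sim N(0, \sigma^2)$, and the linear estimator $\hat{\theta} = a x$. Under these assumptions the deterministic MSE is $(a-1)^2\theta^2 + a^2\sigma^2$, and averaging it over the prior should reproduce the Bayesian MSE $(a-1)^2\sigma_\theta^2 + a^2\sigma^2$. All function names and numerical values are illustrative.

```python
import math

def deterministic_mse(theta, a, noise_var):
    """E{(a*x - theta)^2 ; theta} for x = theta + w, w ~ N(0, noise_var)."""
    bias = (a - 1.0) * theta
    variance = a * a * noise_var
    return bias * bias + variance

def gaussian_pdf(theta, var):
    """Zero-mean Gaussian density, used here as the assumed prior."""
    return math.exp(-theta * theta / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def bayesian_mse_via_eq6(a, noise_var, prior_var, half_width=12.0, steps=20000):
    """Average the deterministic MSE over the prior using the midpoint rule."""
    lim = half_width * math.sqrt(prior_var)
    h = 2.0 * lim / steps
    total = 0.0
    for k in range(steps):
        theta = -lim + (k + 0.5) * h
        total += deterministic_mse(theta, a, noise_var) * gaussian_pdf(theta, prior_var)
    return total * h

a, noise_var, prior_var = 0.7, 0.5, 2.0
lhs = (a - 1.0) ** 2 * prior_var + a * a * noise_var   # Bayesian MSE in closed form
rhs = bayesian_mse_via_eq6(a, noise_var, prior_var)     # right-hand side of (6)
print(lhs, rhs)
```

The two printed values should agree up to quadrature error, which is the content of (6) for this particular estimator.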



B. The Deterministic Cramer-Rao Bound 

Before developing the OBB, we review some basic results in 
the deterministic estimation setting. Suppose $\theta$ is a deterministic 
parameter vector and let $x$ be a measurement vector having 
pdf $p_{x;\theta}(x;\theta)$. Denote by $\Theta \subset \mathbb{R}^n$ the set of all possible 
values of $\theta$. We assume for technical reasons that $\Theta$ is an 
open set.¹ 

Let $\hat{\theta}$ be an estimator of $\theta$ from the measurements $x$. We 
require the following regularity conditions to ensure that the 
CRB holds [31, §3.1.3]. 

1) $p_{x;\theta}(x;\theta)$ is continuously differentiable with respect to 
$\theta$. This condition is required to ensure the existence of 
the Fisher information. 

2) The Fisher information matrix $J(\theta)$, defined by 

$$[J(\theta)]_{ij} = E\left\{ \frac{\partial \log p_{x;\theta}}{\partial \theta_i} \frac{\partial \log p_{x;\theta}}{\partial \theta_j} ; \theta \right\}, \tag{7}$$

is bounded and positive definite for all $\theta \in \Theta$. This 
ensures that the measurements contain data about the 
unknown parameter. 

3) Exchanging the integral and derivative in the equation 

$$\int t(x)\, \frac{\partial}{\partial \theta}\, p_{x;\theta}(x;\theta)\, dx = \frac{\partial}{\partial \theta} \int t(x)\, p_{x;\theta}(x;\theta)\, dx \tag{8}$$

is justified for any measurable function $t(x)$, in the sense 
that, if one side exists, then the other exists and the two 
sides are equal. A sufficient condition for this to hold is 
that the support of $p_{x;\theta}$ does not depend on $\theta$. 
4) All estimators $\hat{\theta}$ are Borel measurable functions which 
satisfy 

$$\left\| \frac{\partial p_{x;\theta}}{\partial \theta}\, \hat{\theta}^T \right\| \le g(x) \quad \text{for all } \theta \tag{9}$$

for some integrable function $g(x)$. This technical requirement 
is needed in order to exclude certain pathological 
estimators whose statistical behavior is insufficiently 
smooth to allow the application of the CRB. 
The bias of an estimator $\hat{\theta}$ is defined as 

$$b(\theta) = E\{\hat{\theta} ; \theta\} - \theta. \tag{10}$$



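Under the above assumptions, it can be shown that the bias 
of any estimator is continuously differentiable [5, Lemma 2]. 
Furthermore, under these assumptions, the CRB holds, and 
thus, for any estimator having bias $b(\theta)$, we have 

$$E\{\|\hat{\theta} - \theta\|^2 ; \theta\} \ge \mathrm{CRB}[b,\theta] \triangleq \mathrm{Tr}\left[ \left(I + \frac{\partial b}{\partial \theta}\right) J^{-1}(\theta) \left(I + \frac{\partial b}{\partial \theta}\right)^T \right] + \|b(\theta)\|^2. \tag{11}$$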

A more common form of the CRB is obtained by restricting 
attention to unbiased estimators (i.e., techniques for which 

¹This is required in order to ensure that one can discuss differentiability 
of $p_{x;\theta}$ with respect to $\theta$ at any point $\theta \in \Theta$. In the Bayesian setting, to 
which we will return in Section II-C, $\Theta$ is defined by (1); in this case, adding 
a boundary to $\Theta$ essentially leaves the setting unchanged, as long as the 
prior probability for $\theta$ to be on the boundary of $\Theta$ is zero. Therefore, this 
requirement is of little practical relevance. 



$b(\theta) = 0$). Under the unbiasedness assumption, the bound 
simplifies to $\mathrm{MSE} \ge \mathrm{Tr}(J^{-1}(\theta))$. However, in the sequel we 
will make use of the general form (11). 
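
To make (11) concrete, here is a small sketch for the scalar case, in which the bound reduces to $(1 + b'(\theta))^2 / J(\theta) + b^2(\theta)$. The measurement model $x \sim N(\theta, \sigma^2)$ and the shrinkage estimator $\hat{\theta} = a x$ are assumptions chosen purely for illustration; for them $J = 1/\sigma^2$ and $b(\theta) = (a-1)\theta$, and the biased bound coincides with the exact deterministic MSE.

```python
def crb_biased_scalar(bias, bias_derivative, fisher_information):
    """Scalar form of (11): (1 + b')^2 / J + b^2."""
    return (1.0 + bias_derivative) ** 2 / fisher_information + bias ** 2

noise_var = 0.5          # assumed noise variance sigma^2
J = 1.0 / noise_var      # Fisher information of x ~ N(theta, sigma^2)
theta = 1.3              # an arbitrary parameter value
a = 0.7                  # shrinkage factor of the hypothetical estimator a*x

# Bias of a*x is (a - 1)*theta, with derivative (a - 1).
biased_bound = crb_biased_scalar((a - 1.0) * theta, a - 1.0, J)
exact_mse = a * a * noise_var + ((a - 1.0) * theta) ** 2

# With zero bias the bound reduces to 1/J.
unbiased_bound = crb_biased_scalar(0.0, 0.0, J)
print(biased_bound, exact_mse, unbiased_bound)
```

In this example the biased bound is smaller than the unbiased one at the chosen $\theta$, which is the effect that the optimal-bias construction exploits.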

C. A Bayesian Bound from the CRB 

The OBB of Young and Westerberg [5] is based on applying 
the Bayesian-deterministic connection described in 
Section II-A to the deterministic CRB (11). Specifically, returning 
now to the Bayesian setting, one can combine (6) and (11) to 
obtain that, for any estimator $\hat{\theta}$ with bias function $b(\theta)$, 

$$E\{\|\hat{\theta} - \theta\|^2\} \ge Z[b] \triangleq \int_\Theta \mathrm{CRB}[b,\theta]\, p_\theta(d\theta) \tag{12}$$

where the expectation is now performed over both $\theta$ and $x$. 
Note that (12) describes the Bayesian MSE as a function of 
a deterministic property (the bias) of $\hat{\theta}$. Since any estimator 
has some bias function, and since all bias functions are 
continuously differentiable in our setting, minimizing $Z[b]$ 
over all continuously differentiable functions $b$ yields a lower 
bound on the MSE of any Bayesian estimator. Thus, under 
the regularity conditions of Section II-B, a lower bound on 
the Bayesian MSE is given by 

$$s = \inf_{b \in C^1} Z[b] = \inf_{b \in C^1} \int_\Theta \left( \mathrm{Tr}\left[ \left(I + \frac{\partial b}{\partial \theta}\right) J^{-1}(\theta) \left(I + \frac{\partial b}{\partial \theta}\right)^T \right] + \|b(\theta)\|^2 \right) p_\theta(d\theta) \tag{13}$$

where $C^1$ is the space of continuously differentiable functions 
$f : \Theta \to \mathbb{R}^n$. 

Note that the OBB differs from the Bayesian CRB of 
Van Trees [3]. Van Trees' result is based on applying the 
Cauchy-Schwarz inequality to the joint pdf $p_{x,\theta}$, whereas the 
deterministic CRB is based on applying a similar procedure 
to $p_{x;\theta}$. As a consequence, the regularity conditions required 
for the Bayesian CRB are stricter, requiring that $p_{x,\theta}$ be twice 
differentiable with respect to $\theta$. By contrast, the OBB requires 
differentiability only of the conditional pdf $p_{x|\theta}$. An example 
in which this difference is important is the case in which the 
prior distribution $p_\theta$ is discontinuous, e.g., when $p_\theta$ is uniform. 
The performance of the OBB in this setting will be examined 
in Section VI. 

In the next section, we will see that it is advantageous to 
perform the minimization (13) over a somewhat modified class 
of functions. This will allow us to prove the unique existence 
of a solution to the optimization problem, a result which will 
be of use when examining the properties of the bound later in 
the paper. 
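
The structure of the functional $Z[b]$ can be seen in a deliberately simplified sketch. Assume a scalar parameter, a constant Fisher information $J$, a zero-mean prior with second moment $m_2$, and restrict attention to linear bias functions $b(\theta) = k\theta$ (a restriction made here for illustration only). Then $Z[b] = (1+k)^2/J + k^2 m_2$, a quadratic in $k$ whose minimum is $1/(J + 1/m_2)$ at $k^\star = -1/(1 + J m_2)$. Because linear functions form only a subset of the admissible class, this value is in general an upper bound on the infimum in (13), not the OBB itself.

```python
def z_linear_bias(k, fisher_information, prior_second_moment):
    """Z[b] of (12) for b(theta) = k*theta, scalar theta, constant J, zero-mean prior."""
    return (1.0 + k) ** 2 / fisher_information + k * k * prior_second_moment

J = 2.0     # assumed constant Fisher information
m2 = 1.5    # assumed prior second moment E{theta^2}

# Brute-force search over the slope k in [-1, 0].
grid = [-1.0 + i / 100000.0 for i in range(100001)]
k_grid = min(grid, key=lambda k: z_linear_bias(k, J, m2))

k_star = -1.0 / (1.0 + J * m2)     # stationary point of the quadratic in k
z_star = 1.0 / (J + 1.0 / m2)      # value of Z at the stationary point
print(k_grid, k_star, z_linear_bias(k_star, J, m2), z_star)
```

The trade-off is visible directly: $k = 0$ removes the squared-bias term but leaves $1/J$, while $k = -1$ removes the gradient term but leaves $m_2$; the minimizer balances the two.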

III. Mathematical Safeguards 

In the previous section, we saw that a lower bound on the 
MMSE can be obtained by solving the minimization problem 
(13). However, at this point, we have no guarantee that the 
solution $s$ of (13) is anywhere near the true value of the 
MMSE. Indeed, at first sight, it may appear that $s = 0$ for 
any estimation setting. To see this, note that $Z[b]$ is a sum 
of two components, a bias gradient part and a squared bias 
part. Both parts are nonnegative, but the former is zero when 
the bias gradient is $-I$, while the latter is zero when the 
bias is zero. No differentiable function $b$ satisfies these two 
constraints simultaneously for all $\theta$, since if the squared bias is 
everywhere zero, then the bias gradient is also zero. However, 
it is possible to construct a sequence of functions $b^{(i)}$ for 
which both the bias gradient part and the squared bias norm tend 
to zero for almost every value of $\theta$. An example of such a 
sequence in a one-dimensional setting is plotted in Fig. 1. Here, 
a sequence $b^{(i)}$ of smooth, periodic functions is presented. The 
function period tends to zero, and the percentage of the cycle 
in which the derivative equals $-1$ increases as $i$ increases. 
Thus, the pointwise limit of the function sequence is zero 
almost everywhere, and the pointwise limit of the derivative 
is $-1$ almost everywhere. 

Fig. 1. A sequence of continuous functions for which both $|b(\theta)|^2$ and 
$|1 + b'(\theta)|$ tend to zero for almost every value of $\theta$. 

In the specific case shown in Fig. 1, it can be shown that the 
value of $Z[b^{(i)}]$ does not tend to zero; in fact, $Z[b^{(i)}]$ tends 
to infinity in this situation. However, our example illustrates 
that care must be taken when applying concepts from finite- 
dimensional optimization problems to variational calculus. 
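
The divergence of $Z$ along such a sequence can be illustrated numerically. The sketch below uses a piecewise-linear sawtooth as a stand-in for the smooth functions of Fig. 1 (an assumption made for simplicity; the sawtooth is only weakly differentiable). Its slope is $-1$ on a fraction $1-\epsilon$ of each period and $(1-\epsilon)/\epsilon$ on the remaining fraction, so that it stays continuous and periodic. Averaged over one period, the squared bias shrinks with the period, while the average of $(1+b')^2$ equals $1/\epsilon$ and therefore grows without bound.

```python
def sawtooth(theta, period, eps):
    """Return (b, b') for a continuous, periodic, piecewise-linear function.

    The slope is -1 on a fraction 1 - eps of each period and
    (1 - eps) / eps on the remaining fraction eps.
    """
    t = theta % period
    amp = 0.5 * (1.0 - eps) * period
    fall = (1.0 - eps) * period
    if t < fall:
        return amp - t, -1.0
    rise = (1.0 - eps) / eps
    return -amp + (t - fall) * rise, rise

def mean_parts(period, eps, samples=100000):
    """Average b^2 and (1 + b')^2 over one period using midpoint sampling."""
    sq_bias = 0.0
    grad = 0.0
    for k in range(samples):
        b, db = sawtooth((k + 0.5) * period / samples, period, eps)
        sq_bias += b * b
        grad += (1.0 + db) ** 2
    return sq_bias / samples, grad / samples

# Shrinking period and shrinking fraction eps of "rising" slope.
results = [mean_parts(period, eps)
           for period, eps in [(1.0, 0.5), (0.1, 0.1), (0.01, 0.01)]]
for sq_bias, grad in results:
    print(sq_bias, grad)
```

Pointwise convergence of the integrand almost everywhere is thus compatible with a diverging integral, because the short rising segments carry an ever larger gradient penalty.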

The purpose of this section is to show that $s > 0$, so that 
the bound is meaningful, for any problem setting satisfying the 
regularity conditions of Section II-B. (This question was not 
addressed by Young and Westerberg [5].) While doing so, we 
develop some abstract concepts which will also be used when 
analyzing the asymptotic properties of the OBB in Section V. 

As often happens with variational problems, it turns out 
that the minimum of (13) is not necessarily achieved by any 
continuously differentiable function. In order to guarantee an 
achievable minimum, one must instead minimize (13) over a 
slightly modified space, which is defined below. As explained 
in Section II-B, all bias functions are continuously differentiable, 
so that the minimizing function ultimately obtained, if 
it is not differentiable, will not be the bias of any estimator. 
However, as we will see, the minimum value of our new 
optimization problem is identical to the infimum of (13). 



Furthermore, this approach allows us to demonstrate several 
important theoretical properties of the OBB. 

Let $L^2$ be the space of $p_\theta$-measurable functions $b : \Theta \to \mathbb{R}^n$ 
such that 

$$\int_\Theta \|b(\theta)\|^2\, p_\theta(d\theta) < \infty. \tag{14}$$

Define the associated inner product 

$$\langle b^{(1)}, b^{(2)} \rangle_{L^2} \triangleq \sum_{i=1}^n \int_\Theta b_i^{(1)}(\theta)\, b_i^{(2)}(\theta)\, p_\theta(d\theta) \tag{15}$$

and the corresponding norm $\|b\|_{L^2}^2 = \langle b, b \rangle_{L^2}$. Any function 
$b \in L^2$ has a derivative in the distributional sense, but this 
derivative might not be a function. For example, discontinuous 
functions have distributional derivatives which contain a Dirac 
delta. If, for every $i$, the distributional derivative $\partial b_i/\partial \theta$ of $b$ 
is a function in $L^2$, then $b$ is said to be weakly differentiable 
[32], and its weak derivative is the matrix function $\partial b/\partial \theta$. 
Roughly speaking, a function is weakly differentiable if it is 
continuous and its derivative exists almost everywhere. 

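The space of all weakly differentiable functions in $L^2$ is 
called the first-order Sobolev space [32], and is denoted $H^1$. 
Define an inner product on $H^1$ as 

$$\langle b^{(1)}, b^{(2)} \rangle_{H^1} \triangleq \langle b^{(1)}, b^{(2)} \rangle_{L^2} + \sum_{i=1}^n \left\langle \frac{\partial b_i^{(1)}}{\partial \theta}, \frac{\partial b_i^{(2)}}{\partial \theta} \right\rangle_{L^2}. \tag{16}$$

The associated norm is $\|b\|_{H^1}^2 = \langle b, b \rangle_{H^1}$. An important 
property which will be used extensively in our analysis is that 
$H^1$ is a Hilbert space. 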

Note that since $\Theta$ is an open set, not all functions in $C^1$ are 
in $H^1$. For example, in the case $\Theta = \mathbb{R}^n$, the function $b(\theta) = 
k$, for some nonzero constant $k$, is continuously differentiable 
but not integrable. Thus $b$ is in $C^1$ but not in $H^1$, nor even 
in $L^2$. However, any measurable function which is not in $H^1$ 
has $\|b\|_{H^1} = \infty$, meaning that either $b$ or $\partial b/\partial \theta$ has infinite 
$L^2$ norm. Consequently, either the bias norm part or the bias 
gradient part of $Z[b]$ is infinite. It follows that performing the 
minimization (13) over $C^1 \cap H^1$, rather than over $C^1$, does 
not change the minimum value. On the other hand, $C^1 \cap H^1$ is 
dense in $H^1$, and $Z[b]$ is continuous, so that minimizing (13) 
over $H^1$ rather than $C^1 \cap H^1$ also does not alter the minimum. 
Consequently, we will henceforth consider the problem 

$$\inf_{b \in H^1} Z[b]. \tag{17}$$

The advantage of including weakly differentiable functions 
in the minimization is that a unique minimizer can now be 
guaranteed, as demonstrated by the following result. 

Proposition 1: Consider the problem 

$$b^{(0)} = \arg\min_{b \in H^1} Z[b] \tag{18}$$

where $Z[b]$ is given by (12) and $J(\theta)$ is positive definite 
and bounded with probability 1. This problem is well-defined, 
i.e., there exists a unique $b^{(0)} \in H^1$ which minimizes $Z[b]$. 
Furthermore, the minimum value $s = Z[b^{(0)}]$ is finite and 
nonzero. 

Proving the unique existence of a minimizer for (18) is a 
technical exercise in functional analysis which can be found in 
Appendix II. However, once the existence of such a minimizer 
is demonstrated, it is not difficult to see that $0 < s < \infty$. To 
see that $s < \infty$, we must find a function $b$ for which $Z[b] < 
\infty$. One such function is $b = 0$, for which $Z[b]$ is finite since 
$J(\theta)$ is bounded. Now suppose by contradiction that $s = 0$, 
which implies that there exists a function $b \in H^1$ such that 
$Z[b] = 0$. Therefore, both the bias gradient and the squared 
bias parts of $Z[b]$ are zero. In particular, since the squared bias 
part equals zero, we have $\|b\|_{L^2} = 0$. Hence, $b = 0$, because 
$L^2$ is a normed space. But then, by the definition (12) of $Z[\cdot]$, 

$$Z[b] = \int_\Theta \mathrm{Tr}(J^{-1}(\theta))\, p_\theta(d\theta) \tag{19}$$

which is positive; this is a contradiction. 

Note that functions in $H^1$ are defined up to changes on a set 
having zero measure. In particular, the fact that the minimizer $b^{(0)}$ of (18) is unique 
does not preclude functions which are identical to $b^{(0)}$ almost 
everywhere (which obviously have the same value $Z[b]$). 

Summarizing the discussion of the last two sections, we 
have the following theorem. 

Theorem 1: Let $\theta$ be an unknown random vector with pdf 
$p_\theta(\theta) > 0$ over the open set $\Theta \subset \mathbb{R}^n$, and let $x$ be a 
measurement vector whose pdf, conditioned on $\theta$, is given by 
$p_{x|\theta}(x|\theta)$. Assume the regularity conditions of Section II-B 
hold. Then, for any estimator $\hat{\theta}$, 

$$E\{\|\hat{\theta} - \theta\|^2\} \ge \min_{b \in H^1} \int_\Theta \mathrm{CRB}[b,\theta]\, p_\theta(\theta)\, d\theta. \tag{20}$$

The minimum in (20) is nonzero and finite. Furthermore, this 
minimum is achieved by a function $b \in H^1$, which is unique 
up to changes having zero probability. 

Two remarks are in order concerning Theorem 1. First, 
the function $b$ solving (20) might not be the bias of any 
estimator; indeed, under our assumptions, all bias functions are 
continuously differentiable, whereas $b$ need only be weakly 
differentiable. Nevertheless, (20) is still a lower bound on 
the MMSE. Another important observation is that Theorem 1 
arises from the deterministic CRB; hence, there are no requirements 
on the prior distribution $p_\theta(\theta)$. In particular, $p_\theta(\theta)$ can 
be discontinuous or have bounded support. By contrast, many 
previous Bayesian bounds do not apply in such circumstances. 

IV. Calculating the Bound 

In finite-dimensional convex optimization problems, the 
requirement of a vanishing first derivative results in a set 
of equations, whose solution is the global minimum. Analogously, 
in the case of convex functional optimization problems 
such as (20), the optimum is given by the solution of a set of 
differential equations. The following theorem, whose proof can 
be found in Appendix III, specifies the differential equation 
relevant to our optimization problem. 

In this section and in the remainder of the paper, we will 
consider the case in which the set $\Theta = \{\theta : p_\theta(\theta) > 0\}$ is 
bounded. From a practical point of view, even when $\Theta$ consists 
of the entire set $\mathbb{R}^n$, it can be approximated by a bounded set 
containing only those values of $\theta$ for which $p_\theta(\theta) > \epsilon$. 

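Theorem 2: Under the conditions of Theorem 1, suppose $\Theta$ 
is a bounded subset of $\mathbb{R}^n$ with a smooth boundary $\Lambda$. Then, 
the optimal $b(\theta)$ of (20) is given by the solution to the system 
of partial differential equations 

$$p_\theta(\theta)\, b_i(\theta) = p_\theta(\theta) \sum_{j,k} (J^{-1})_{jk} \frac{\partial^2 b_i}{\partial \theta_j \partial \theta_k} + \sum_{j,k} \left( \delta_{ik} + \frac{\partial b_i}{\partial \theta_k} \right) \left( (J^{-1})_{jk} \frac{\partial p_\theta}{\partial \theta_j} + p_\theta(\theta) \frac{\partial (J^{-1})_{jk}}{\partial \theta_j} \right) \tag{21}$$

for $i = 1, \ldots, n$, within the range $\theta \in \Theta$, which satisfies the 
Neumann boundary condition 

$$\left( I + \frac{\partial b}{\partial \theta} \right) J^{-1}\, \nu(\theta) = 0 \tag{22}$$

for all points $\theta \in \Lambda$. Here, $\delta_{ik}$ is the Kronecker delta and $\nu(\theta)$ is a normal to the boundary 
at $\theta$. All derivatives in this system of equations are to be 
interpreted in the weak sense. 

Note that Theorem 1 guarantees the existence of a unique 
solution in $H^1$ to the differential equation (21) with the 
boundary conditions (22). 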

The bound of Young and Westerberg [5] is a special case 
of Theorem 2 and is given here for completeness. 

Corollary 1: Under the settings of Theorem 1, suppose 
$\Theta = (\theta_0, \theta_1)$ is a bounded interval in $\mathbb{R}$. Then, the bias 
function $b(\theta)$ minimizing (20) is a solution to the second-order 
ordinary differential equation 

$$J(\theta)\, b(\theta) = b''(\theta) + (1 + b'(\theta)) \left( \frac{d \log p_\theta}{d\theta} - \frac{d \log J}{d\theta} \right) \tag{23}$$

within the range $\theta \in \Theta$, subject to the boundary conditions 
$b'(\theta_0) = b'(\theta_1) = -1$. 
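
The boundary-value problem of Corollary 1 is straightforward to discretize. The following sketch treats the simplest special case, chosen here as an assumption: a uniform prior on $(-a, a)$ and a constant Fisher information $J$, for which (23) reduces to $J b = b''$ with $b'(\pm a) = -1$. It uses second-order finite differences with ghost points for the Neumann conditions and a tridiagonal (Thomas) solve. As a check, the result is compared with the value $(1/J)\,(1 - \tanh(a\sqrt{J})/(a\sqrt{J}))$, which was derived for this sketch by solving the reduced ODE and integrating by parts; it is not quoted from the paper.

```python
import math

def solve_tridiagonal(sub, diag, sup, rhs):
    """Thomas algorithm; sub[k] multiplies x[k-1], sup[k] multiplies x[k+1]."""
    n = len(diag)
    d = diag[:]
    r = rhs[:]
    for k in range(1, n):
        m = sub[k] / d[k - 1]
        d[k] -= m * sup[k - 1]
        r[k] -= m * r[k - 1]
    x = [0.0] * n
    x[-1] = r[-1] / d[-1]
    for k in range(n - 2, -1, -1):
        x[k] = (r[k] - sup[k] * x[k + 1]) / d[k]
    return x

def obb_uniform(a, J, M=2000):
    """Solve J*b = b'' on (-a, a) with b'(-a) = b'(a) = -1 and evaluate Z[b]."""
    h = 2.0 * a / M
    c = 2.0 + J * h * h
    # Interior rows: b[k-1] - c*b[k] + b[k+1] = 0.
    # Boundary rows use ghost points to impose b' = -1 at both ends.
    sub = [0.0] + [1.0] * (M - 1) + [2.0]
    diag = [-c] * (M + 1)
    sup = [2.0] + [1.0] * (M - 1) + [0.0]
    rhs = [-2.0 * h] + [0.0] * (M - 1) + [2.0 * h]
    b = solve_tridiagonal(sub, diag, sup, rhs)
    # Central differences inside, prescribed derivative at the endpoints.
    db = [-1.0] + [(b[k + 1] - b[k - 1]) / (2.0 * h) for k in range(1, M)] + [-1.0]
    f = [(1.0 + db[k]) ** 2 / J + b[k] ** 2 for k in range(M + 1)]
    integral = h * (sum(f) - 0.5 * (f[0] + f[-1]))   # trapezoidal rule
    return integral / (2.0 * a), b                   # uniform prior density 1/(2a)

a, J = 1.0, 4.0
z_numeric, b = obb_uniform(a, J)
x = a * math.sqrt(J)
z_closed = (1.0 - math.tanh(x) / x) / J
print(z_numeric, z_closed)
```

For priors or Fisher information functions that vary with $\theta$, the extra first-order term of (23) would enter the off-diagonal coefficients; that extension is not shown here.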

The system of Theorem 2 can be solved numerically, thus obtaining a 
bound for any problem satisfying the regularity conditions. 
However, directly solving (21) becomes increasingly complex 
as the dimension of the problem increases. Instead, in many 
cases, symmetry relations in the problem can be used to 
simplify the solution. As an example, the following spherically 
symmetric case can be reduced to a one-dimensional setting 
equivalent to that of Corollary 1. The proof of this theorem 
can be found in Appendix IV. 

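Theorem 3: Under the setting of Theorem 1, suppose that 
$\Theta = \{\theta : \|\theta\| < r\}$ is a sphere centered on the origin, $p_\theta(\theta) = 
q(\|\theta\|)$ is spherically symmetric, and $J(\theta) = J(\|\theta\|) I$, where 
$J : \mathbb{R} \to \mathbb{R}$ is a scalar function. Then, the optimal-bias bound 
(20) is given by 

$$E\{\|\hat{\theta} - \theta\|^2\} \ge \frac{2\pi^{n/2}}{\Gamma(n/2)} \int_0^r \left[ b^2(\rho) + \frac{n-1}{J(\rho)} \left( 1 + \frac{b(\rho)}{\rho} \right)^2 + \frac{(1 + b'(\rho))^2}{J(\rho)} \right] q(\rho)\, \rho^{n-1}\, d\rho. \tag{24}$$

Here, $\Gamma(\cdot)$ is the Gamma function, and $b(\rho)$ is a solution to 
the ODE 

$$J(\rho)\, b(\rho) = b''(\rho) + (n-1) \left( \frac{b'(\rho)}{\rho} - \frac{b(\rho)}{\rho^2} \right) + (1 + b'(\rho)) \left( \frac{d \log q}{d\rho} - \frac{d \log J}{d\rho} \right) \tag{25}$$

subject to the boundary conditions $b(0) = 0$, $b'(r) = -1$. The 
bias function for which the bound is achieved is given by 

$$b(\theta) = b(\|\theta\|)\, \frac{\theta}{\|\theta\|}. \tag{26}$$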
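
The radial functional on the right-hand side of (24) can be evaluated for any candidate radial bias $b(\rho)$; only the solution of the ODE (25) yields the bound itself, and any other choice gives a larger value. The sketch below is a consistency check under assumed settings (uniform prior on the ball, constant $J$, hypothetical function names): for $b = 0$ the functional should return $\mathrm{Tr}(J^{-1}) = n/J$, and for $b(\rho) = -\rho$, which corresponds to the bias function $b(\theta) = -\theta$, it should return $E\{\|\theta\|^2\} = n r^2/(n+2)$.

```python
import math

def sphere_area_constant(n):
    """Surface area of the unit sphere in R^n: 2 * pi^(n/2) / Gamma(n/2)."""
    return 2.0 * math.pi ** (n / 2.0) / math.gamma(n / 2.0)

def radial_functional(b, db, J, q, n, r, steps=20000):
    """Right-hand side of (24) for a given radial bias b(rho), midpoint rule."""
    h = r / steps
    total = 0.0
    for k in range(steps):
        rho = (k + 0.5) * h
        integrand = (b(rho) ** 2
                     + (n - 1) * (1.0 + b(rho) / rho) ** 2 / J(rho)
                     + (1.0 + db(rho)) ** 2 / J(rho))
        total += integrand * q(rho) * rho ** (n - 1)
    return sphere_area_constant(n) * total * h

n, r, J0 = 3, 2.0, 4.0
ball_volume = math.pi ** (n / 2.0) / math.gamma(n / 2.0 + 1.0) * r ** n
q = lambda rho: 1.0 / ball_volume   # uniform prior on the ball (assumed)
J = lambda rho: J0                  # constant Fisher information (assumed)

z_zero_bias = radial_functional(lambda rho: 0.0, lambda rho: 0.0, J, q, n, r)
z_full_shrink = radial_functional(lambda rho: -rho, lambda rho: -1.0, J, q, n, r)
print(z_zero_bias, n / J0)
print(z_full_shrink, n * r * r / (n + 2.0))
```

Both limiting values follow from the normalization of the prior, which makes them a convenient test of the constant $2\pi^{n/2}/\Gamma(n/2)$ and of the weight $\rho^{n-1}$.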



In this theorem, the requirement $J(\theta) = J(\|\theta\|) I$ indicates 
that the Fisher information matrix is diagonal and that its 
components are spherically symmetric. Parameters having a 
diagonal matrix $J$ are sometimes referred to as orthogonal. 
The simplest case of orthogonality occurs when, to each 
parameter $\theta_i$, there corresponds a measurement $x_i$, in such 
a way that the random variables $x_i | \theta$ are independent. Other 
orthogonal scenarios can often be constructed by an appropriate 
parametrization [33]. 

The requirement that $J$ have spherically symmetric components 
occurs, for example, in location problems, i.e., situations 
in which the measurements have the form $x = \theta + w$, where 
$w$ is additive noise which is independent of $\theta$. Indeed, under 
such conditions, $J$ is constant in $\theta$ [31, §3.1.3]. If, in addition, 
the noise components are independent, then this setting also 
satisfies the orthogonality requirement, and thus application of 
Theorem 3 is appropriate. Note that this estimation problem 
is not separable, since the components of $\theta$ are correlated; 
thus, the MMSE in this situation is lower than the sum of 
the components' MMSE. An example of such a setting is 
presented in Section VI. 

V. Properties 

In this section, we examine several properties of the OBB. 
We first demonstrate that the optimal bias function has zero 
mean, a property which also characterizes the bias function of 
the MMSE estimator. Next, we prove that, under very general 
conditions, the resulting bound is tight at both low and high 
SNR values. This is an important result, since a desirable 
property of a Bayesian bound is that it provides an accurate 
estimate of the ambiguity region between high and low SNR 
[11]. Reliable estimation at the two extremes increases the 
likelihood that the transition between these two regimes will 
be correctly identified. 

A. Optimal Bias Has Zero Mean 

In any Bayesian estimation problem, the bias $b_{\mathrm{opt}}(\theta)$ of the MMSE 
estimator $\hat{\theta}_{\mathrm{opt}} = E\{\theta | x\}$ has zero mean: 

$$E\{\hat{\theta}_{\mathrm{opt}}\} = E\{E\{\theta | x\}\} = E\{\theta\} \tag{27}$$

so that 

$$E\{b_{\mathrm{opt}}(\theta)\} = E\{E\{\theta | x\} - \theta\} = 0. \tag{28}$$

Thus, it is interesting to ask whether the optimal bias which 
minimizes (20) also has zero mean. This is indeed the case, 
as shown by the following theorem. 

Theorem 4: Let $b(\theta)$ be the solution to (20). Then, 

$$E\{b(\theta)\} = 0. \tag{29}$$

Proof: Assume by contradiction that $b(\theta)$ has nonzero 
mean $E\{b(\theta)\} = \mu \ne 0$. Define $b_0(\theta) = b(\theta) - \mu$. Since $b_0$ and $b$ 
have the same gradient, from (11) we then have 

$$\mathrm{CRB}[b_0,\theta] - \mathrm{CRB}[b,\theta] = \|b_0(\theta)\|^2 - \|b(\theta)\|^2 = \|\mu\|^2 - 2\mu^T b(\theta). \tag{30}$$

Using the functional $Z[\cdot]$ defined in (12), we obtain 

$$Z[b_0] - Z[b] = E\{\|\mu\|^2 - 2\mu^T b(\theta)\} = \|\mu\|^2 - 2\mu^T E\{b(\theta)\} = -\|\mu\|^2 < 0. \tag{31}$$

Thus $Z[b_0] < Z[b]$, contradicting the fact that $b(\theta)$ minimizes (20). 


B. Tightness at Low SNR 

Bell et al. [11] examined the performance of the extended 
Ziv-Zakai bound at low SNR and demonstrated that, for 
a particular family of distributions, the extended Ziv-Zakai 
bound achieves the MSE of the optimal estimator as the SNR 
tends to 0. We now examine the low-SNR performance of the 
OBB, and demonstrate tightness for a much wider range of 
problem settings. 

Bell et al. did not define the general meaning of a low SNR 
value, and only stated that "[a]s observation time and/or SNR 
become very small, the observations become useless . . . [and] 
the minimum MSE estimator converges to the a priori mean." 
This statement clearly does not apply to all estimation prob- 
lems, since it is not always clear what parameter corresponds 
to the observation time or the SNR. We propose to define 
the zero SNR case more generally as any situation in which 
$J(\theta) = 0$ with probability 1. This definition implies that the 
measurements do not contain information about the unknown 
parameter, which is the usual informal meaning of zero SNR. 
In the case $J(\theta) = 0$, it can be shown that the MMSE 
estimator is the prior mean, so that our definition implies the 
statement of Bell et al. 

The OBB is inapplicable when $J(\theta) = 0$, since the CRB 
is based on the assumption that $J(\theta)$ is positive definite. To 
avoid this singularity, we consider a sequence of estimation 
settings which converge to zero SNR. More specifically, we 
require all eigenvalues of $J(\theta)$ to decrease monotonically to 
zero for $p_\theta$-almost all $\theta$. The following theorem, the proof of 
which can be found in Appendix V, demonstrates the tightness 
of the OBB in this low-SNR setting. 

Theorem 5: Let $\theta$ be a random vector whose pdf $p_\theta(\theta)$ is 
nonzero over an open set $\Theta \subset \mathbb{R}^n$. Let $x^{(1)}, x^{(2)}, \ldots$ be a 
sequence of observation vectors having finite Fisher information 
matrices $J^{(1)}(\theta), J^{(2)}(\theta), \ldots$, respectively. Suppose that, for 
all $N$, the matrix $J^{(N)}(\theta)$ is positive definite for $p_\theta$-almost all 
$\theta$, and that all eigenvalues of $J^{(N)}(\theta)$ decrease monotonically 
to zero as $N \to \infty$ for $p_\theta$-almost all $\theta$. Let $\beta_N$ denote the 
optimal-bias bound for estimating $\theta$ from $x^{(N)}$. Then, 

$$\lim_{N \to \infty} \beta_N = E\{\|\theta - E\{\theta\}\|^2\}. \tag{32}$$



C. Tightness at High SNR 

We now examine the performance of the OBB for high
SNR values. To formally define the high SNR regime, we
consider a sequence of measurements $x^{(1)}, x^{(2)}, \ldots$ of a single
parameter vector $\theta$. It is assumed that, when conditioned on
$\theta$, all measurements $x^{(i)}$ are identically and independently
distributed (IID). Furthermore, we assume that the Fisher
information matrix of a single observation $J(\theta)$ is well-defined,
positive definite and finite for $p_\theta$-almost all $\theta$. We consider
the problem of estimating $\theta$ from the set of measurements
$\{x^{(1)}, \ldots, x^{(N)}\}$, for a given value of $N$. The high SNR
regime is obtained when $N$ is large.

When $N$ tends to infinity, the MSE of the optimal estimator
tends to zero. An important question, however, concerns the
rate of convergence of the minimum MSE. More precisely,
given the optimal estimator $\hat\theta^{(N)}$ of $\theta$ from $\{x^{(1)}, \ldots, x^{(N)}\}$,
one would like to determine the asymptotic distribution of
$\sqrt{N}(\hat\theta^{(N)} - \theta)$, conditioned on $\theta$. A fundamental result of
asymptotic estimation theory can be loosely stated as follows
[28, §III.3], [13, §6.8]. Under some fairly mild regularity
conditions, the asymptotic distribution of $\sqrt{N}(\hat\theta^{(N)} - \theta)$,
conditioned on $\theta$, does not depend on the prior distribution
$p_\theta$; rather, $\sqrt{N}(\hat\theta^{(N)} - \theta) \mid \theta$ converges in distribution to
a Gaussian random vector with mean zero and covariance
$J^{-1}(\theta)$. It follows that

$$\lim_{N \to \infty} N\, E\{\|\hat\theta^{(N)} - \theta\|^2\} = E\{\mathrm{Tr}[J^{-1}(\theta)]\}. \tag{33}$$

Since the minimum MSE tends to zero at high SNR,
any lower bound on the minimum MSE must also tend to
zero as $N \to \infty$. However, one would further expect a
good lower bound to follow the behavior of (33). In other
words, if $\beta_N$ represents the lower bound for estimating $\theta$
from $\{x^{(1)}, \ldots, x^{(N)}\}$, a desirable property is
$N\beta_N \to E\{\mathrm{Tr}[J^{-1}(\theta)]\}$. The following theorem, whose proof is
found in Appendix V, demonstrates that this is indeed the
case for the OBB.

Except for a very brief treatment by Bellini and Tartara 
[6], no previous Bayesian bound has shown such a result. 
Although it appears that the Ziv-Zakai and Weiss-Weinstein 
bounds may also satisfy this property, this has not been proven 
formally. It is also known that the Bayesian CRB is not 
asymptotically tight in this sense [34, Eqs. (37)-(39)].

Theorem 6: Let $\theta$ be a random vector whose pdf $p_\theta(\theta)$
is nonzero over an open set $\Theta \subset \mathbb{R}^n$. Let $x^{(1)}, x^{(2)}, \ldots$ be a
sequence of measurement vectors, such that $x^{(1)}|\theta, x^{(2)}|\theta, \ldots$
are IID. Let $J(\theta)$ be the Fisher information matrix for
estimating $\theta$ from $x^{(1)}$, and suppose $J(\theta)$ is finite and positive
definite for $p_\theta$-almost all $\theta$. Let $\beta_N$ be the optimal-bias
bound (20) for estimating $\theta$ from the observation sequence
$\{x^{(1)}, \ldots, x^{(N)}\}$. Then,

$$\lim_{N \to \infty} N \beta_N = E\{\mathrm{Tr}(J^{-1}(\theta))\}. \tag{34}$$

Note that for Theorem 6 to hold, we require only that
$J(\theta)$ be finite and positive definite. By contrast, the various
theorems guaranteeing asymptotic efficiency of Bayesian
estimators all require substantially stronger regularity conditions
[28, §III.3], [13, §6.8]. One reason for this is that asymptotic
efficiency describes the behavior of $\hat\theta$ conditioned on each
possible value of $\theta$, and is thus a stronger result than the
asymptotic Bayesian MSE of (33).
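As a small numerical illustration of the scaling in Theorem 6, the sketch below uses the scalar uniform-prior example of Section VI, for which the OBB has the closed form (61). It assumes that $N$ IID Gaussian observations with noise variance $\sigma^2$ carry Fisher information $N/\sigma^2$, so that the bound for $N$ observations is obtained from (61) by replacing $\sigma$ with $\sigma/\sqrt{N}$. The function names are ours and purely illustrative.

```python
import math

def obb_scalar(r, sigma):
    """Closed-form OBB (61) for theta ~ U[-r, r] and x = theta + N(0, sigma^2)."""
    ratio = r / sigma
    return sigma ** 2 * (1.0 - math.tanh(ratio) / ratio)

def obb_iid(r, sigma, num_obs):
    """OBB for num_obs IID observations (assumption: Fisher information
    num_obs / sigma^2, i.e. an effective noise std of sigma / sqrt(num_obs))."""
    return obb_scalar(r, sigma / math.sqrt(num_obs))

if __name__ == "__main__":
    # N * beta_N should approach E{Tr(J^{-1})} = sigma^2 as N grows.
    for num_obs in (1, 10, 100, 10000):
        print(num_obs, num_obs * obb_iid(1.0, 1.0, num_obs))
```

In this example $N\beta_N$ increases towards $\sigma^2$, in agreement with (34).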

VI. Example: Uniform Prior 

The original bound of Young and Westerberg [5] predates
most Bayesian bounds, and, surprisingly, it has never been
cited by or compared with later results. In this section, we
measure the performance of the original bound and of its
extension to the vector case against that of various other
techniques. We consider the case in which $\theta$ is uniformly
distributed over an $n$-dimensional open ball
$\Theta = \{\theta : \|\theta\| < r\} \subset \mathbb{R}^n$, so that

$$p_\theta(\theta) = \frac{1}{V_n(r)}\, 1_\Theta \tag{35}$$

where $1_S$ equals 1 when $\theta \in S$ and 0 otherwise, and

$$V_n(r) = \frac{\pi^{n/2} r^n}{\Gamma(1 + n/2)} \tag{36}$$

is the volume of an $n$-ball of radius $r$ [35]. We further assume
that

$$x = \theta + w \tag{37}$$

where $w$ is zero-mean Gaussian noise, independent of $\theta$,
having covariance $\sigma^2 I$. We are interested in lower bounds on
the MSE achievable by an estimator of $\theta$ from $x$.
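The prior (35) and the volume formula (36) are straightforward to evaluate numerically. The following minimal sketch does so; the function names are illustrative choices of ours and do not appear in the original development.

```python
import math

def ball_volume(n, r):
    """Volume (36) of an n-dimensional ball of radius r."""
    return math.pi ** (n / 2) * r ** n / math.gamma(1 + n / 2)

def uniform_ball_pdf(theta, r):
    """Density (35): constant 1 / V_n(r) inside the open ball, zero outside."""
    n = len(theta)
    norm = math.sqrt(sum(t * t for t in theta))
    return 1.0 / ball_volume(n, r) if norm < r else 0.0
```

For $n = 1, 2, 3$ the volume reduces to the familiar $2r$, $\pi r^2$ and $4\pi r^3/3$.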

We begin by developing the OBB for this setting, as well 
as some alternative bounds. We then compare the different 
approaches in a one-dimensional and a three-dimensional 
setting. 

The Fisher information matrix for the given estimation
problem is given by $J(\theta) = \sigma^{-2} I$, so that the conditions
of Theorem 3 hold. It follows that the optimal bias function
is given by $b(\theta) = b(\|\theta\|)\,\theta/\|\theta\|$, where $b(\cdot)$ is a solution to
the differential equation

$$\theta^2 b''(\theta) + (n-1)\,\theta\, b'(\theta) - \left(n - 1 + \frac{\theta^2}{\sigma^2}\right) b(\theta) = 0 \tag{38}$$

with boundary conditions $b(0) = 0$, $b'(r) = -1$. The general
solution to this differential equation is given by

$$b(\theta) = C_1\, \theta^{1-n/2} I_{n/2}\!\left(\frac{\theta}{\sigma}\right) + C_2\, \theta^{1-n/2} K_{n/2}\!\left(\frac{\theta}{\sigma}\right) \tag{39}$$

where $I_\alpha(z)$ and $K_\alpha(z)$ are the modified Bessel functions
of the first and second types, respectively [36]. Since $K_\alpha(z)$
is singular at the origin, the requirement $b(0) = 0$ leads to
$C_2 = 0$. Differentiating (39) with respect to $\theta$, we obtain

$$b'(\theta) = C_1\, \theta^{-n/2} \left[ I_{n/2}\!\left(\frac{\theta}{\sigma}\right) + \frac{\theta}{\sigma}\, I_{1+n/2}\!\left(\frac{\theta}{\sigma}\right) \right] \tag{40}$$

so that the requirement $b'(r) = -1$ leads to

$$C_1 = -\frac{r^{n/2}}{I_{n/2}(r/\sigma) + (r/\sigma)\, I_{1+n/2}(r/\sigma)}. \tag{41}$$

Substituting this value of $b(\cdot)$ into (24) yields the OBB, which
can be computed by evaluating a single one-dimensional
integral. Alternatively, in the one-dimensional case, the integral
can be computed analytically, as will be shown below.
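The one-dimensional integral mentioned above can be evaluated with a few lines of code. The sketch below is only an illustration under stated assumptions: it takes the bias function from (39)-(41) with $C_2 = 0$, the on-axis CRB from (130) with $J = \sigma^{-2}$, and the radial integral (127), whose weight $q(\rho) S_n(\rho)$ reduces to $n\rho^{n-1}/r^n$ for the uniform prior. The Bessel function is computed from its power series (adequate for moderate $r/\sigma$), the integral uses a midpoint rule, and all function names are ours.

```python
import math

def bessel_i(nu, z, terms=60):
    """Modified Bessel function of the first kind, I_nu(z), via its power series.
    The fixed number of terms is an assumption suited to moderate z (say z < 20)."""
    return sum((z / 2.0) ** (2 * k + nu) / (math.factorial(k) * math.gamma(k + nu + 1))
               for k in range(terms))

def obb_uniform_ball(n, r, sigma, steps=2000):
    """OBB for theta uniform on an n-ball of radius r, x = theta + N(0, sigma^2 I)."""
    nu = n / 2.0
    x = r / sigma
    # Constant C_1 from the boundary condition b'(r) = -1, see (41).
    c1 = -r ** nu / (bessel_i(nu, x) + x * bessel_i(nu + 1, x))
    h = r / steps
    total = 0.0
    for k in range(steps):
        rho = (k + 0.5) * h                      # midpoint rule avoids rho = 0
        z = rho / sigma
        b = c1 * rho ** (1 - nu) * bessel_i(nu, z)                              # (39)
        db = c1 * rho ** (-nu) * (bessel_i(nu, z) + z * bessel_i(nu + 1, z))    # (40)
        # CRB on the theta_1 axis, see (130), with 1/J = sigma^2.
        crb = b * b + sigma ** 2 * ((1 + db) ** 2 + (n - 1) * (1 + b / rho) ** 2)
        weight = n * rho ** (n - 1) / r ** n     # q(rho) * S_n(rho) for the uniform prior
        total += crb * weight * h
    return total
```

For $n = 1$ the routine should reproduce the closed-form expression obtained analytically in the one-dimensional case, which provides a convenient sanity check.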

Despite the widespread use of finite-support prior
distributions [4], [10], the regularity conditions of many bounds
are violated by such prior pdfs. Indeed, the Bayesian
CRB of Van Trees [3], the Bobrovski-Zakai bound [8], and the
Bayesian Abel bound [12] all assume that $p_\theta(\theta)$ has infinite
support, and thus cannot be applied in this scenario.

Techniques from the Ziv-Zakai family are applicable to
constrained problems. An extension of the Ziv-Zakai bound
for vector parameter estimation was developed by Bell et al.
[11]. From [11, Property 4], the MSE of the $i$th component
of $\theta$ is bounded by

$$E\{(\hat\theta_i - \theta_i)^2\} \ge \int_0^\infty V\Big\{ \max_{\delta : e_i^T \delta = h} A(\delta)\, P_{\min}(\delta) \Big\}\, h\, dh \tag{42}$$

where $e_i$ is a unit vector in the direction of the $i$th component,
$V\{\cdot\}$ is the valley-filling function defined by

$$V\{f(h)\} = \max_{\eta \ge 0} f(h + \eta), \tag{43}$$

$$A(\delta) \triangleq \int \min\big(p_\theta(\theta),\, p_\theta(\theta + \delta)\big)\, d\theta, \tag{44}$$

and $P_{\min}(\delta)$ is the minimum probability of error for the
problem of testing hypothesis $H_0 : \theta = \theta_0$ vs. $H_1 : \theta = \theta_0 + \delta$.
In the current setting, $P_{\min}(\delta)$ is given by
$P_{\min}(\delta) = Q(\|\delta\|/2\sigma)$, where
$Q(z) = (2\pi)^{-1/2} \int_z^\infty e^{-t^2/2}\, dt$ is the tail
function of the normal distribution. Also, we have

$$A(\delta) = \frac{V_n^C(r, \|\delta\|)}{V_n(r)} \tag{45}$$

where

$$V_n^C(r, h) = \int 1_\Theta\, 1_{\Theta + h e_1}\, d\theta \tag{46}$$

and $\Theta + h e_1 = \{\theta + h e_1 : \theta \in \Theta\}$. Thus, $V_n^C(r, h)$ is the
volume of the intersection of two $n$-balls whose centers are
at a distance of $h$ units from one another. Substituting these
results into (42), we have

$$E\{(\hat\theta_i - \theta_i)^2\} \ge \int_0^\infty V\Big\{ \max_{\delta : e_i^T \delta = h} \frac{V_n^C(r, \|\delta\|)}{V_n(r)}\, Q\!\left(\frac{\|\delta\|}{2\sigma}\right) \Big\}\, h\, dh. \tag{47}$$

Note that both $V_n^C(r, \|\delta\|)$ and $Q(\|\delta\|/2\sigma)$ decrease with $\|\delta\|$.
Therefore, the maximum in (47) is obtained for $\delta = h e_i$.
Also, since the argument of $V\{\cdot\}$ is monotonically decreasing,
the valley-filling function has no effect and can be removed.
Finally, since $V_n^C(r, h) = 0$ for $h > 2r$, the integration can
be limited to the range $[0, 2r]$. Thus, the extended Ziv-Zakai
bound is given by

$$E\{\|\hat\theta - \theta\|^2\} \ge n \int_0^{2r} \frac{V_n^C(r, h)}{V_n(r)}\, Q\!\left(\frac{h}{2\sigma}\right) h\, dh. \tag{48}$$

We now compute the Weiss-Weinstein bound for the setting
at hand. This bound is given by

$$E\{\|\hat\theta - \theta\|^2\} \ge \mathrm{Tr}(H G^{-1} H^T) \tag{49}$$

where $H = [h_1, \ldots, h_m]$ is a matrix containing an arbitrary
number $m$ of test vectors and $G$ is a matrix whose elements
are given by

$$G_{ij} = \frac{E\{r(x, \theta; h_i, s_i)\, r(x, \theta; h_j, s_j)\}}{E\{L^{s_i}(x; \theta + h_i, \theta)\}\, E\{L^{s_j}(x; \theta + h_j, \theta)\}} \tag{50}$$

in which

$$r(x, \theta; h_i, s_i) = L^{s_i}(x; \theta + h_i, \theta) - L^{1 - s_i}(x; \theta - h_i, \theta) \tag{51}$$

and

$$L(x; \theta_1, \theta_2) \triangleq \frac{p_\theta(\theta_1)\, p_{x|\theta}(x|\theta_1)}{p_\theta(\theta_2)\, p_{x|\theta}(x|\theta_2)}. \tag{52}$$

The vectors $h_1, \ldots, h_m$ and the scalars $s_1, \ldots, s_m$ are
arbitrary, and can be optimized to maximize the bound (49). To
avoid a multidimensional nonconvex optimization problem, we
restrict attention to $m = n$, $h_i = h e_i$, and $s_i = 1/2$, as
suggested by [10]. This results in a dependency on a single
scalar parameter $h$.

Under these conditions, $G_{ij}$ can be written as

$$G_{ij} = \frac{1}{M(h_i) M(h_j)} \big[ M(h_i - h_j, -h_j) + M(h_i - h_j, h_i) - M(h_i + h_j, h_j) - M(h_i + h_j, h_i) \big] \tag{53}$$

where

$$M(h) \triangleq E\{L^{1/2}(x; \theta + h, \theta)\} \tag{54}$$

and

$$M(h_1, h_2) \triangleq E\{L^{1/2}(x; \theta + h_1, \theta)\, 1_{\Theta + h_2}\}. \tag{55}$$

Note that we have used the corrected version of the
Weiss-Weinstein bound [37]. Substituting the probability distribution
of $x$ and $\theta$ into the definitions of $M(h)$ and $M(h_1, h_2)$, we
have

$$M(h) = E\big\{ e^{-\|\theta + h - x\|^2 / 4\sigma^2}\, e^{\|\theta - x\|^2 / 4\sigma^2}\, 1_{\Theta + h} \big\} = \frac{V_n^C(r, \|h\|)}{V_n(r)}\, e^{-\|h\|^2 / 8\sigma^2} \tag{56}$$

and, similarly,

$$M(h_1, h_2) = \frac{e^{-\|h_1\|^2 / 8\sigma^2}}{V_n(r)} \int 1_\Theta\, 1_{\Theta + h_1}\, 1_{\Theta + h_2}\, d\theta. \tag{57}$$

Thus, $M(h)$ is a function only of $\|h\|$, and $M(h_1, h_2)$ is a
function only of $\|h_1\|$, $\|h_2\|$, and $\|h_1 - h_2\|$. Since $h_i = h e_i$,
it follows that, for $i \ne j$, the numerator of (53) vanishes. Thus,
$G$ is a diagonal matrix, whose diagonal elements equal

$$G_{ii} = \frac{2\,[M(0, h e_1) - M(2 h e_1, h e_1)]}{M^2(h e_1)}. \tag{58}$$

The Weiss-Weinstein bound is given by substituting this result
into (49) and maximizing over $h$, i.e.,

$$E\{\|\hat\theta - \theta\|^2\} \ge \max_{h \in [0, 2r]} \frac{n h^2 M^2(h e_1)}{2\,[M(0, h e_1) - M(2 h e_1, h e_1)]}. \tag{59}$$

The value of $h$ yielding the tightest bound can be determined
by performing a grid search.





Fig. 2. Comparison of the MSE bounds and the minimum achievable MSE in a one-dimensional setting for which $\theta \sim U[-r, r]$ and $x|\theta \sim N(\theta, \sigma^2)$. Panels (a) and (b) are plotted against SNR (dB), from -20 to 20; the legend entries are the optimal-bias bound, the Weiss-Weinstein bound, and the Bellini-Tartara bound.



To compare the OBB with the alternative approaches
developed above, we first consider the one-dimensional case in
which $\theta$ is uniformly distributed in the range $\Theta = (-r, r)$. Let
$x = \theta + w$ be a single noisy observation, where $w$ is
zero-mean Gaussian noise, independent of $\theta$, with variance $\sigma^2$. We
wish to bound the MSE of an estimator of $\theta$ from $x$.

The optimal bias function is given by (39). Using the fact
that $I_{1/2}(t) = \sqrt{2/\pi}\, \sinh(t) / \sqrt{t}$, we obtain

$$b(\theta) = -\sigma\, \frac{\sinh(\theta/\sigma)}{\cosh(r/\sigma)} \tag{60}$$

which also follows [5] from Corollary 1. Substituting this
expression into (20), we have that, for any estimator $\hat\theta$,

$$E\{(\hat\theta - \theta)^2\} \ge \sigma^2 \left( 1 - \frac{\tanh(r/\sigma)}{r/\sigma} \right). \tag{61}$$

Apart from the reduction in computational complexity, the
simplicity of (61) also emphasizes several features of the
estimation problem. First, the dependence of the problem
on the dimensionless quantity $r/\sigma$, rather than on $r$ and $\sigma$
separately, is clear. This is to be expected, as a change in units
of measurement would multiply both $r$ and $\sigma$ by a constant.
Second, the asymptotic properties demonstrated in Theorems
5 and 6 can be easily verified. For $r \gg \sigma$, the bound converges
to the noise variance $\sigma^2$, corresponding to an uninformative
prior whose optimal estimator is $\hat\theta = x$; whereas, for $\sigma \gg r$,
a Taylor expansion of $\tanh(z)/z$ immediately shows that
the bound converges to $r^2/3$, corresponding to the case of
uninformative measurements, where the optimal estimator is
$\hat\theta = 0$. Thus, the bound (61) is tight both for very low and for
very high SNR, as expected.
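The two limits and the scale invariance noted above are easy to confirm numerically. The short sketch below evaluates (61) directly; the function name is an illustrative choice of ours.

```python
import math

def obb_1d(r, sigma):
    """Closed-form OBB (61) for theta ~ U[-r, r] and x | theta ~ N(theta, sigma^2)."""
    ratio = r / sigma
    return sigma ** 2 * (1.0 - math.tanh(ratio) / ratio)

if __name__ == "__main__":
    print(obb_1d(1000.0, 1.0))   # r >> sigma: close to the noise variance sigma^2 = 1
    print(obb_1d(1.0, 1000.0))   # sigma >> r: close to the prior variance r^2 / 3
    # Rescaling r and sigma by c rescales the bound by c^2 (change of units).
    print(obb_1d(2.0, 2.0) / obb_1d(1.0, 1.0))
```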

In the one-dimensional case, we have $V_1(r) = 2r$ and
$V_1^C(r, h) = \max(2r - h, 0)$, so that the extended Ziv-Zakai
bound (48) and the Weiss-Weinstein bound (59) can also be
simplified somewhat. In particular, the extended Ziv-Zakai
bound (48) can be written as

$$E\{\|\hat\theta - \theta\|^2\} \ge \int_0^{2r} \left( 1 - \frac{h}{2r} \right) Q\!\left(\frac{h}{2\sigma}\right) h\, dh. \tag{62}$$

Using integration by parts, (62) becomes

$$E\{\|\hat\theta - \theta\|^2\} \ge \sigma^2\, \Gamma_{3/2}\!\left(\frac{r^2}{2\sigma^2}\right) + \frac{2 r^2}{3}\, Q\!\left(\frac{r}{\sigma}\right) - \frac{8 \sigma^3}{3 \sqrt{2\pi}\, r}\, \Gamma_2\!\left(\frac{r^2}{2\sigma^2}\right) \tag{63}$$

where $\Gamma_a(z) = (1/\Gamma(a)) \int_0^z e^{-t} t^{a-1}\, dt$ is the incomplete
Gamma function. Like the expression (61) for the OBB, this
bound can be shown to converge to the noise variance $\sigma^2$ when
$r \gg \sigma$ and to the prior variance $r^2/3$ when $\sigma \gg r$. However,
while the convergence of the OBB to these asymptotic values
has been demonstrated in general in Theorems 5 and 6, the
asymptotic tightness of the Ziv-Zakai bound in the general
case remains an open question.
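As a cross-check of the one-dimensional extended Ziv-Zakai bound, the sketch below integrates $\int_0^{2r} (1 - h/2r)\, Q(h/2\sigma)\, h\, dh$ with Simpson's rule and compares it with a closed form in terms of regularized incomplete Gamma functions. The closed form is our own evaluation of that integral by parts; it uses the elementary identities $\Gamma_{3/2}(z) = \mathrm{erf}(\sqrt{z}) - (2/\sqrt{\pi})\sqrt{z}\, e^{-z}$ and $\Gamma_2(z) = 1 - (1+z)e^{-z}$. Function names are illustrative.

```python
import math

def q_func(z):
    """Tail function of the standard normal distribution."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def zzb_1d_numeric(r, sigma, steps=20000):
    """Simpson's rule for the integral of (1 - h/2r) Q(h/2 sigma) h over [0, 2r].
    steps must be even."""
    def integrand(h):
        return (1.0 - h / (2.0 * r)) * q_func(h / (2.0 * sigma)) * h
    width = 2.0 * r / steps
    total = integrand(0.0) + integrand(2.0 * r)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * integrand(k * width)
    return total * width / 3.0

def zzb_1d_closed(r, sigma):
    """Closed form of the same integral via regularized incomplete Gamma functions."""
    z = r * r / (2.0 * sigma * sigma)
    gamma_32 = math.erf(math.sqrt(z)) - 2.0 / math.sqrt(math.pi) * math.sqrt(z) * math.exp(-z)
    gamma_2 = 1.0 - (1.0 + z) * math.exp(-z)
    return (sigma ** 2 * gamma_32
            + (2.0 * r * r / 3.0) * q_func(r / sigma)
            - 8.0 * sigma ** 3 / (3.0 * math.sqrt(2.0 * math.pi) * r) * gamma_2)
```

Agreement between the two routines, together with the limits $\sigma^2$ (for $r \gg \sigma$) and $r^2/3$ (for $\sigma \gg r$), gives some confidence in the expressions.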

The Weiss-Weinstein bound (59) can likewise be simplified
further in the one-dimensional case, yielding

$$E\{\|\hat\theta - \theta\|^2\} \ge \max_{h \in [0, 2r]} \frac{h^2\, e^{-h^2/4\sigma^2} \left( 1 - \frac{h}{2r} \right)^2}{2 \left[ 1 - \frac{h}{2r} - \max\!\left(0,\, 1 - \frac{h}{r}\right) e^{-h^2/2\sigma^2} \right]}. \tag{64}$$

However, calculating this bound still requires a numerical
search for the optimal value of $h$.
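The numerical search can be as simple as a uniform grid over $(0, 2r)$. The sketch below assumes the one-dimensional objective $h^2 e^{-h^2/4\sigma^2}(1 - h/2r)^2 / \big(2[1 - h/2r - \max(0, 1 - h/r)\, e^{-h^2/2\sigma^2}]\big)$, obtained from (56)-(59) with $n = 1$; the endpoints are excluded because the ratio is of the form $0/0$ there. The function name and grid size are illustrative.

```python
import math

def wwb_1d(r, sigma, grid=4000):
    """Grid search for the one-dimensional Weiss-Weinstein bound (assumed objective,
    see the lead-in). The grid covers the open interval (0, 2r)."""
    best = 0.0
    for k in range(1, grid):
        h = 2.0 * r * k / grid
        overlap = 1.0 - h / (2.0 * r)                        # V_1^C(r, h) / V_1(r)
        numerator = h * h * math.exp(-h * h / (4.0 * sigma ** 2)) * overlap ** 2
        denominator = 2.0 * (overlap
                             - max(0.0, 1.0 - h / r) * math.exp(-h * h / (2.0 * sigma ** 2)))
        best = max(best, numerator / denominator)
    return best
```

With this objective, the limit of very noisy measurements gives $8r^2/27$, slightly below the prior variance $r^2/3$.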

These bounds are compared with the exact value of the
MMSE in Fig. 2. In this figure, the SNR is defined as

$$\mathrm{SNR(dB)} = 10 \log_{10} \frac{\mathrm{Var}(\theta)}{\mathrm{Var}(w)} = 10 \log_{10} \frac{r^2}{3\sigma^2}. \tag{65}$$

The MMSE was computed by Monte Carlo approximation of
the error of the optimal estimator $E\{\theta|x\}$, which was itself
computed by numerical integration. Fig. 2(a) plots the MMSE
and the values obtained by the aforementioned bounds, while
Fig. 2(b) plots the ratio between each of the bounds and the
actual MMSE in order to emphasize the difference in accuracy
between the various bounds. As can be seen from this figure,
the OBB is closer to the true MSE than all other bounds, for
all tested SNR values.

The improvements provided by the OBB continue to hold
in higher dimensions as well, although in this case it is not
possible to provide a closed form for any of the bounds. For
example, Fig. 3 compares the aforementioned bounds with the
true MMSE in the three-dimensional case. In this case the SNR
is given by

$$\mathrm{SNR(dB)} = 10 \log_{10} \frac{\mathrm{Var}(\theta)}{\mathrm{Var}(w)} = 10 \log_{10} \frac{r^2}{5\sigma^2}. \tag{66}$$

Here, computation of the minimum MSE requires
multidimensional numerical integration, and is by far more
computationally complex than the calculation of the bounds. Again, it
is evident from this figure that the OBB is a very tight bound
in all ranges of operation, and is considerably closer to the
true value than either of the alternative approaches.

Fig. 3. Comparison of the MSE bounds and the minimum achievable MSE in a three-dimensional setting for which $\theta$ is uniformly distributed over a ball
of radius $r$ and $x|\theta \sim N(\theta, \sigma^2 I)$.
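For the one-dimensional setting of Fig. 2, the comparison between the OBB and the MMSE can be reproduced approximately with the sketch below. It departs from the procedure described above in one respect, which we state plainly: instead of a Monte Carlo approximation, it uses the closed-form posterior mean of a truncated Gaussian and deterministic Simpson quadrature over $x$, together with the identity $\mathrm{MMSE} = \mathrm{Var}(\theta) - E\{(E\{\theta|x\})^2\}$. All function names are ours.

```python
import math

def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def posterior_mean(x, r, sigma):
    """E{theta | x} for theta ~ U[-r, r], x | theta ~ N(theta, sigma^2):
    the mean of N(x, sigma^2) truncated to [-r, r]."""
    a, b = (-r - x) / sigma, (r - x) / sigma
    mass = norm_cdf(b) - norm_cdf(a)
    return x + sigma * (norm_pdf(a) - norm_pdf(b)) / mass

def marginal_pdf(x, r, sigma):
    """Marginal density of x."""
    a, b = (-r - x) / sigma, (r - x) / sigma
    return (norm_cdf(b) - norm_cdf(a)) / (2.0 * r)

def mmse_1d(r, sigma, steps=4000):
    """MMSE = r^2/3 - E{(E{theta|x})^2}; Simpson's rule on [0, r + 8 sigma],
    doubled by symmetry in x. steps must be even."""
    upper = r + 8.0 * sigma
    width = upper / steps
    def integrand(x):
        return posterior_mean(x, r, sigma) ** 2 * marginal_pdf(x, r, sigma)
    total = integrand(0.0) + integrand(upper)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * integrand(k * width)
    return r * r / 3.0 - 2.0 * total * width / 3.0

def obb_1d(r, sigma):
    """Closed-form OBB (61)."""
    ratio = r / sigma
    return sigma ** 2 * (1.0 - math.tanh(ratio) / ratio)

if __name__ == "__main__":
    for sigma in (0.3, 1.0, 3.0):
        print(sigma, obb_1d(1.0, sigma), mmse_1d(1.0, sigma))
```

For $r = \sigma = 1$ the bound lies a few percent below the MMSE, and the MMSE in turn lies below the linear MMSE $(r^2/3)\sigma^2/(r^2/3 + \sigma^2)$.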

VII. Conclusion 
Although often considered distinct settings, there are in- 
sightful connections between the Bayesian and deterministic 
estimation problems. One such relation is the use of the 
deterministic CRB in a Bayesian problem. The application 
of this deterministic bound to the problem of estimating the 
minimum Bayesian MSE results in a Bayesian bound which 
is provably tight at both high and low SNR values. Numerical 
simulation of the location estimation problem demonstrates 
that the technique is both simpler and tighter than alternative 
approaches. 
Appendix I
Some Technical Lemmas 

The proof of several theorems in the paper relies on the 
following technical results. 

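Lemma 1: Consider the minimization problems

$$M_\ell = \inf_{b \in S} Z_\ell[b], \quad \ell = 1, 2, 3 \tag{67}$$

where

$$Z_1[b] \triangleq \int_\Theta \|b(\theta)\|^2\, p_\theta(d\theta)$$
$$Z_2[b] \triangleq \int_\Theta \mathrm{Tr}\left[ \left(I + \frac{\partial b}{\partial \theta}\right) J^{-1}(\theta) \left(I + \frac{\partial b}{\partial \theta}\right)^T \right] p_\theta(d\theta)$$
$$Z_3[b] \triangleq Z_1[b] + Z_2[b], \tag{68}$$

$J(\theta)$ is positive definite and bounded a.e. ($p_\theta$),
and $S \subset H^1$ is convex, closed, and bounded under the $H^1$
norm (16). Then, for each $\ell$, there exists a function $b_{\mathrm{opt}}^{(\ell)} \in S$
such that $Z_\ell[b_{\mathrm{opt}}^{(\ell)}] = M_\ell$. If $\ell = 1$ or $\ell = 3$, then the minimizer
of (67) is unique.

Note that $Z_3[b]$ equals the functional $Z[b]$ of the main text; the notation $Z_3[b]$ is
introduced for simplicity. Also note that under mild regularity
assumptions on $J(\theta)$, uniqueness can be demonstrated for $\ell = 2$
as well, but this is not necessary for our purposes.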

Proof: The space $H^1$ is a Cartesian product of $n$ Sobolev
spaces $H^1(\Theta)$, each of which is a separable Hilbert space
[38, §3.7.1]. Therefore, $H^1$ is also a separable Hilbert space.
It follows from the Banach-Alaoglu theorem [39, §3.17] that
all bounded sequences in $H^1$ have weakly convergent
subsequences [32, §2.18]. Recall that a sequence
$f^{(1)}, f^{(2)}, \ldots \in H^1$ is said to converge weakly to
$f^{(0)} \in H^1$ (denoted $f^{(j)} \rightharpoonup f^{(0)}$) if

$$L[f^{(j)}] \to L[f^{(0)}] \tag{69}$$

for all continuous linear functionals $L[\cdot]$ [32, §2.9].

Given a particular value $\ell \in \{1, 2, 3\}$, let $b^{(i)}$ be a sequence
of functions in $S$ such that $Z_\ell[b^{(i)}] \to M_\ell$. This is a bounded
sequence since $S$ is bounded, and therefore there exists a
subsequence $b^{(i_k)}$ which converges weakly to some $b_{\mathrm{opt}}^{(\ell)} \in H^1$.
Furthermore, since $S$ is closed (see footnote 2), we have $b_{\mathrm{opt}}^{(\ell)} \in S$. We will
now show that $Z_\ell[b_{\mathrm{opt}}^{(\ell)}] = M_\ell$.

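To this end, it suffices to show that $Z_\ell[\cdot]$ is weakly lower
semicontinuous, i.e., for any sequence $f^{(j)} \in H^1$ which
converges weakly to $f^{(0)} \in H^1$, we must show that

$$Z_\ell[f^{(0)}] \le \liminf_{j \to \infty} Z_\ell[f^{(j)}]. \tag{70}$$

Consider a weakly convergent sequence $f^{(j)} \rightharpoonup f^{(0)}$. Then,
(69) holds for any continuous linear functional $L[\cdot]$.
Specifically, choose the continuous linear functional

$$L_1[f] = \int_\Theta f^{(0)T}(\theta)\, f(\theta)\, p_\theta(d\theta). \tag{71}$$

We then have

$$Z_1[f^{(0)}] = L_1[f^{(0)}] = \lim_{j \to \infty} L_1[f^{(j)}] \le \liminf_{j \to \infty} \sqrt{ \int_\Theta \|f^{(0)}(\theta)\|^2\, p_\theta(d\theta) \int_\Theta \|f^{(j)}(\theta)\|^2\, p_\theta(d\theta) } = \sqrt{Z_1[f^{(0)}]}\, \liminf_{j \to \infty} \sqrt{Z_1[f^{(j)}]} \tag{72}$$

where we have used the Cauchy-Schwarz inequality. It follows
that

$$\sqrt{Z_1[f^{(0)}]} \le \liminf_{j \to \infty} \sqrt{Z_1[f^{(j)}]} \tag{73}$$

and therefore $Z_1[f^{(0)}] \le \liminf_{j \to \infty} Z_1[f^{(j)}]$, so that $Z_1[\cdot]$
is weakly lower semicontinuous.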

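Similarly, consider the continuous linear functional

$$L_2[f] = \int_\Theta \mathrm{Tr}\left[ \left(I + \frac{\partial f^{(0)}}{\partial \theta}\right) J^{-1}(\theta) \left(I + \frac{\partial f}{\partial \theta}\right)^T \right] p_\theta(d\theta) \tag{74}$$

for which we have

$$Z_2[f^{(0)}] = L_2[f^{(0)}] = \lim_{j \to \infty} L_2[f^{(j)}] = \lim_{j \to \infty} \int_\Theta \mathrm{Tr}\left[ \left(I + \frac{\partial f^{(0)}}{\partial \theta}\right) J^{-1}(\theta) \left(I + \frac{\partial f^{(j)}}{\partial \theta}\right)^T \right] p_\theta(d\theta). \tag{75}$$

Note that, for any positive definite matrix $W$, $\mathrm{Tr}(A W B^T)$
is an inner product of the two matrices $A$ and $B$. Therefore,
by the Cauchy-Schwarz inequality,

$$\mathrm{Tr}(A W B^T) \le \sqrt{\mathrm{Tr}(A W A^T)\, \mathrm{Tr}(B W B^T)}. \tag{76}$$

Applying this to (75), we have

$$Z_2[f^{(0)}] \le \liminf_{j \to \infty} \int_\Theta \sqrt{ \mathrm{Tr}\left[ \left(I + \frac{\partial f^{(0)}}{\partial \theta}\right) J^{-1}(\theta) \left(I + \frac{\partial f^{(0)}}{\partial \theta}\right)^T \right] } \sqrt{ \mathrm{Tr}\left[ \left(I + \frac{\partial f^{(j)}}{\partial \theta}\right) J^{-1}(\theta) \left(I + \frac{\partial f^{(j)}}{\partial \theta}\right)^T \right] }\, p_\theta(d\theta). \tag{77}$$

Once again using the Cauchy-Schwarz inequality results in

$$Z_2[f^{(0)}] \le \liminf_{j \to \infty} \sqrt{ Z_2[f^{(0)}]\, Z_2[f^{(j)}] } \tag{78}$$

and therefore $Z_2[f^{(0)}] \le \liminf_{j \to \infty} Z_2[f^{(j)}]$, so that $Z_2[\cdot]$ is
weakly lower semicontinuous. Since $Z_3[f] = Z_1[f] + Z_2[f]$,
it follows that $Z_3[\cdot]$ is also weakly lower semicontinuous.

Footnote 2: In fact, we require that $S$ be "weakly closed" in the sense that weakly
convergent sequences in $S$ converge to an element in $S$. However, since $S$
is convex, this notion is equivalent to the ordinary definition of closure [39,
§3.13].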



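Now recall that $b^{(i_k)} \rightharpoonup b_{\mathrm{opt}}^{(\ell)}$ and $Z_\ell[b^{(i_k)}] \to M_\ell$. By the
definition (70) of lower semicontinuity, it follows that

$$Z_\ell[b_{\mathrm{opt}}^{(\ell)}] \le \liminf_{k \to \infty} Z_\ell[b^{(i_k)}] = M_\ell \tag{79}$$

and since $M_\ell$ is the infimum of $Z_\ell[b]$, we obtain
$Z_\ell[b_{\mathrm{opt}}^{(\ell)}] = M_\ell$. Thus $b_{\mathrm{opt}}^{(\ell)}$ is a minimizer of (67).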

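It remains to show that for $\ell \in \{1, 3\}$, the minimizer
of (67) is unique. To this end, we first show that $Z_1[\cdot]$ is
strictly convex. Let $b^{(0)}, b^{(1)} \in S$ be two essentially different
functions, i.e.,

$$p_\theta\big(\{\theta \in \Theta : b^{(0)}(\theta) \ne b^{(1)}(\theta)\}\big) > 0. \tag{80}$$

Let $b^{(2)}(\theta) = \lambda b^{(0)}(\theta) + (1 - \lambda) b^{(1)}(\theta)$ for some $0 < \lambda < 1$,
so that $b^{(2)} \in S$ by convexity, and let $Q$ denote the set
appearing in (80). We then have

$$Z_1[b^{(2)}] = \int_Q \|\lambda b^{(0)}(\theta) + (1 - \lambda) b^{(1)}(\theta)\|^2\, p_\theta(d\theta) + \int_{\Theta \setminus Q} \|\lambda b^{(0)}(\theta) + (1 - \lambda) b^{(1)}(\theta)\|^2\, p_\theta(d\theta)$$
$$< \int_Q \big[ \lambda \|b^{(0)}(\theta)\|^2 + (1 - \lambda) \|b^{(1)}(\theta)\|^2 \big]\, p_\theta(d\theta) + \int_{\Theta \setminus Q} \big[ \lambda \|b^{(0)}(\theta)\|^2 + (1 - \lambda) \|b^{(1)}(\theta)\|^2 \big]\, p_\theta(d\theta)$$
$$= \lambda Z_1[b^{(0)}] + (1 - \lambda) Z_1[b^{(1)}] \tag{81}$$

where the inequality follows from strict convexity of the
squared Euclidean norm $\|x\|^2$, which applies on $Q$, a set of
positive probability. Thus $Z_1[\cdot]$ is strictly convex,
and hence has a unique minimum.

Note that $Z_3[b] = Z_1[b] + Z_2[b]$. Since $Z_1[\cdot]$ is strictly
convex and $Z_2[\cdot]$ is convex, it follows that $Z_3[\cdot]$ is strictly
convex, and thus also has a unique minimum. This completes
the proof. ■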
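The following lemma can be thought of as a triangle
inequality for a normed space of matrix functions over $\Theta$.

Lemma 2: Let $p_\theta$ be a probability measure over $\Theta$, and let
$M : \Theta \to \mathbb{R}^{n \times n}$ be a matrix function. Suppose

$$\int_\Theta \|I + M(\theta)\|_F^2\, p_\theta(d\theta) \le \alpha \tag{82}$$

for some constant $\alpha$. It follows that

$$\int_\Theta \|M(\theta)\|_F^2\, p_\theta(d\theta) \le \big(\sqrt{\alpha} + \sqrt{n}\big)^2. \tag{83}$$

Proof: By the triangle inequality,

$$\|M(\theta)\|_F = \|M(\theta) + I - I\|_F \le \|M(\theta) + I\|_F + \|I\|_F. \tag{84}$$

Since $\|I\|_F^2 = n$, we have

$$\int_\Theta \|M(\theta)\|_F^2\, p_\theta(d\theta) \le \int_\Theta \big[ \|I + M(\theta)\|_F^2 + n + 2\sqrt{n}\, \|I + M(\theta)\|_F \big]\, p_\theta(d\theta). \tag{85}$$

Using the fact that

$$\int_\Theta \|I + M(\theta)\|_F\, p_\theta(d\theta) \le \sqrt{ \int_\Theta \|I + M(\theta)\|_F^2\, p_\theta(d\theta) } \tag{86}$$

and combining with (82), it follows that

$$\int_\Theta \|M(\theta)\|_F^2\, p_\theta(d\theta) \le \alpha + n + 2\sqrt{n \alpha} \tag{87}$$

which completes the proof. ■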
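Lemma 2 holds for any probability measure, including the empirical measure of a finite sample, so it can be spot-checked numerically. The sketch below draws random matrices (Gaussian entries, an arbitrary choice of ours), takes $\alpha$ to be the sample mean of $\|I + M\|_F^2$, and compares the sample mean of $\|M\|_F^2$ with $(\sqrt{\alpha} + \sqrt{n})^2$. Names and parameters are illustrative.

```python
import math
import random

def frobenius_sq(matrix):
    """Squared Frobenius norm of a matrix given as a list of rows."""
    return sum(v * v for row in matrix for v in row)

def check_lemma2(n=3, samples=500, seed=0):
    """Return (mean ||M||_F^2, (sqrt(alpha) + sqrt(n))^2) for random n x n matrices,
    where alpha is the mean of ||I + M||_F^2 over the same sample."""
    rng = random.Random(seed)
    mean_m, alpha = 0.0, 0.0
    for _ in range(samples):
        m = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(n)]
        shifted = [[m[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
                   for i in range(n)]
        mean_m += frobenius_sq(m) / samples
        alpha += frobenius_sq(shifted) / samples
    return mean_m, (math.sqrt(alpha) + math.sqrt(n)) ** 2
```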

Appendix II
Proof of Proposition 1

The following proof of Proposition 1 makes use of the
results developed in Appendix I.

Proof: [Proof of Proposition 1] Recall that $Z_3[b]$ of (68)
equals $Z[b]$. Thus, we would like to apply Lemma 1 (with
$\ell = 3$) to prove the unique existence of a minimizer of
(17). However, Lemma 1 requires that the minimization be
performed over a closed, bounded, and convex set $S$, whereas
(17) is performed over the unbounded set $H^1$. To resolve
this issue, we must show that the minimization (17) can be
reformulated as a minimization over a closed, bounded, and
convex set $S$.

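To this end, note that

$$Z[0] = \int_\Theta \mathrm{Tr}(J^{-1}(\theta))\, p_\theta(d\theta) \triangleq U \tag{88}$$

and therefore the infimum $M$ of $Z[b]$ satisfies $M \le U < \infty$. Thus, it suffices to perform the
minimization (17) over those functions for which $Z[b] \le U$.
We now show that this can be achieved by minimizing over
a closed, bounded, and convex set $S$. First, note that
$Z[b] \ge \|b\|_{L^2}^2$, so that one may choose to minimize (17) only over
functions $b$ for which

$$\|b\|_{L^2}^2 \le U. \tag{89}$$

Similarly, we have

$$Z[b] \ge \int_\Theta \mathrm{Tr}\left[ \left(I + \frac{\partial b}{\partial \theta}\right) J^{-1}(\theta) \left(I + \frac{\partial b}{\partial \theta}\right)^T \right] p_\theta(d\theta) \tag{90}$$

so that it suffices to minimize (17) over functions $b$ for which

$$\int_\Theta \mathrm{Tr}\left[ \left(I + \frac{\partial b}{\partial \theta}\right) J^{-1}(\theta) \left(I + \frac{\partial b}{\partial \theta}\right)^T \right] p_\theta(d\theta) \le U. \tag{91}$$

Note that $J(\theta)$ is bounded a.e., and therefore
$\lambda_{\min}(J^{-1}) \ge 1/K$ a.e., for some constant $K$. It follows that

$$\mathrm{Tr}\left[ \left(I + \frac{\partial b}{\partial \theta}\right) J^{-1}(\theta) \left(I + \frac{\partial b}{\partial \theta}\right)^T \right] \ge \frac{1}{K} \left\| I + \frac{\partial b}{\partial \theta} \right\|_F^2 \quad \text{a.e. } (p_\theta). \tag{92}$$

Combining with (91) yields

$$\int_\Theta \left\| I + \frac{\partial b}{\partial \theta} \right\|_F^2 p_\theta(d\theta) \le K U. \tag{93}$$

From Lemma 2, we then have

$$\int_\Theta \left\| \frac{\partial b}{\partial \theta} \right\|_F^2 p_\theta(d\theta) \le \big( \sqrt{K U} + \sqrt{n} \big)^2. \tag{94}$$

From (89) and (94) it follows that the minimization (17) can
be limited to the closed, bounded, convex set

$$S = \Big\{ b \in H^1 : \|b\|_{H^1}^2 \le U + \big( \sqrt{K U} + \sqrt{n} \big)^2 \Big\}. \tag{95}$$

Applying Lemma 1 proves the unique existence of a minimizer
of (17). The proof that $0 < s < \infty$ appears immediately after
the statement of Proposition 1. ■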

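Appendix III
Proof of Theorem 2

The following is the proof of Theorem 2 concerning the
calculation of the OBB.

Proof: [Proof of Theorem 2] Consider the more general
problem of minimizing the functional

$$Z[b] = \int_\Theta F[b, \theta]\, d\theta \tag{96}$$

where $F[b, \theta]$ is smooth and convex in $b : \Theta \to \mathbb{R}^n$, and
$\Theta \subset \mathbb{R}^n$ is a bounded set with a smooth boundary $\Lambda$. Then,
$Z[b]$ is also smooth and convex in $b$, so that $b$ is a global
minimum of $Z[b]$ if and only if the differential $\delta Z[h]$ equals
zero at $b$ for all admissible functions $h : \Theta \to \mathbb{R}^n$ [40].
By a standard technique [40, §35], it can be shown that

$$\delta Z[h] = \epsilon \int_\Theta \sum_i \left( \frac{\partial F}{\partial b_i} - \sum_j \frac{\partial}{\partial \theta_j} \frac{\partial F}{\partial b_i^{(j)}} \right) h_i(\theta)\, d\theta + \epsilon \int_\Lambda \sum_i \left( \frac{\partial F}{\partial b_i^{(1)}}, \ldots, \frac{\partial F}{\partial b_i^{(n)}} \right) \nu(\theta)\, h_i(\theta)\, d\sigma \tag{97}$$

where $\epsilon$ is an infinitesimal quantity, $b_i^{(j)} = \partial b_i / \partial \theta_j$, and
$\nu(\theta)$ is an outward-pointing normal at the boundary point
$\theta \in \Lambda$. We now seek conditions for which $\delta Z[h] = 0$ for
all $h(\theta)$. Consider first functions $h(\theta)$ which equal zero on
the boundary $\Lambda$. In this case, the second integral vanishes, and
we obtain the Euler-Lagrange equations

$$\frac{\partial F}{\partial b_i} - \sum_j \frac{\partial}{\partial \theta_j} \frac{\partial F}{\partial b_i^{(j)}} = 0, \quad i = 1, \ldots, n. \tag{98}$$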

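Substituting this result back into (97), and again using the fact
that $\delta Z[h] = 0$ for all $h$, we obtain the boundary condition

$$\forall i,\ \forall \theta \in \Lambda, \quad \left( \frac{\partial F}{\partial b_i^{(1)}}, \ldots, \frac{\partial F}{\partial b_i^{(n)}} \right) \nu(\theta) = 0. \tag{99}$$

Plugging $F[b, \theta] = \mathrm{CRB}[b, \theta]\, p_\theta(\theta)$ into (98) and (99)
provides the required result. ■

Appendix IV
Proof of Theorem 3

Before proving Theorem 3, we provide the following two
lemmas, which demonstrate some symmetry properties of the
CRB.

Lemma 3: Under the conditions of Theorem 3, the
functional $Z[b]$ is rotation and reflection invariant, i.e.,
$Z[b] = Z[Ub]$ for any unitary matrix $U$.

Proof: We first demonstrate that $Z[b]$ is rotation
invariant. From the definitions of $Z[b]$ and $\mathrm{CRB}[b, \theta]$, we have

$$Z[b] = \int_\Theta \mathrm{Tr}\left[ \left(I + \frac{\partial b}{\partial \theta}\right) \left(I + \frac{\partial b}{\partial \theta}\right)^T \right] \frac{q(\|\theta\|)}{J(\|\theta\|)}\, d\theta + \int_\Theta \|b(\theta)\|^2\, q(\|\theta\|)\, d\theta. \tag{100}$$

The second integral is clearly rotation invariant, since a
rotation of $b$ does not alter its norm. It remains to show that the
first integral, which we denote by $I_1[b]$, does not change when
$b$ is rotated. To this end, we begin by considering a rotation
about the first two coordinates, such that $b$ is transformed to
$\tilde b = R_\phi b$, where the rotation matrix $R_\phi$ is defined such that

$$R_\phi b = (b_1 \cos\phi + b_2 \sin\phi,\ -b_1 \sin\phi + b_2 \cos\phi,\ b_3, \ldots, b_n)^T. \tag{101}$$

We must thus show that $I_1[\tilde b] = I_1[b]$. Let us perform the
change of variables $\theta \mapsto \tilde\theta$, where $\tilde\theta = R_\phi \theta$. Rewriting
the trace in (100) as a sum, we have

$$I_1[\tilde b] = \int_\Theta \sum_{i,j} \left( \delta_{ij} + \frac{\partial \tilde b_i}{\partial \tilde\theta_j} \right)^2 \frac{q(\|\tilde\theta\|)}{J(\|\tilde\theta\|)}\, d\tilde\theta \tag{102}$$

where we have used the facts that $\|\tilde\theta\| = \|\theta\|$ and that $\Theta$ does
not change under the change of variables.

We now demonstrate some properties of the transformation
of $b$ and $\theta$. First, we have, for any $j$,

$$\left( \frac{\partial \tilde b_1}{\partial \tilde\theta_j} \right)^2 + \left( \frac{\partial \tilde b_2}{\partial \tilde\theta_j} \right)^2 = \left( \frac{\partial b_1}{\partial \tilde\theta_j} \cos\phi + \frac{\partial b_2}{\partial \tilde\theta_j} \sin\phi \right)^2 + \left( -\frac{\partial b_1}{\partial \tilde\theta_j} \sin\phi + \frac{\partial b_2}{\partial \tilde\theta_j} \cos\phi \right)^2 = \left( \frac{\partial b_1}{\partial \tilde\theta_j} \right)^2 + \left( \frac{\partial b_2}{\partial \tilde\theta_j} \right)^2. \tag{103}$$

Also, for any $i$,

$$\left( \frac{\partial b_i}{\partial \tilde\theta_1} \right)^2 + \left( \frac{\partial b_i}{\partial \tilde\theta_2} \right)^2 = \left( \frac{\partial b_i}{\partial \theta_1} \frac{\partial \theta_1}{\partial \tilde\theta_1} + \frac{\partial b_i}{\partial \theta_2} \frac{\partial \theta_2}{\partial \tilde\theta_1} \right)^2 + \left( \frac{\partial b_i}{\partial \theta_1} \frac{\partial \theta_1}{\partial \tilde\theta_2} + \frac{\partial b_i}{\partial \theta_2} \frac{\partial \theta_2}{\partial \tilde\theta_2} \right)^2 = \left( \frac{\partial b_i}{\partial \theta_1} \right)^2 + \left( \frac{\partial b_i}{\partial \theta_2} \right)^2 \tag{104}$$

where we used the fact that $\theta = R_{-\phi} \tilde\theta$. Third, we have

$$\frac{\partial \tilde b_1}{\partial \tilde\theta_1} = \frac{\partial b_1}{\partial \theta_1} \cos^2\phi + \frac{\partial b_1}{\partial \theta_2} \sin\phi \cos\phi + \frac{\partial b_2}{\partial \theta_1} \sin\phi \cos\phi + \frac{\partial b_2}{\partial \theta_2} \sin^2\phi$$
$$\frac{\partial \tilde b_2}{\partial \tilde\theta_2} = \frac{\partial b_1}{\partial \theta_1} \sin^2\phi - \frac{\partial b_1}{\partial \theta_2} \sin\phi \cos\phi - \frac{\partial b_2}{\partial \theta_1} \sin\phi \cos\phi + \frac{\partial b_2}{\partial \theta_2} \cos^2\phi \tag{105}$$

so that

$$\frac{\partial \tilde b_1}{\partial \tilde\theta_1} + \frac{\partial \tilde b_2}{\partial \tilde\theta_2} = \frac{\partial b_1}{\partial \theta_1} + \frac{\partial b_2}{\partial \theta_2}. \tag{106}$$

We now show that

$$\sum_{i,j} \left( \delta_{ij} + \frac{\partial \tilde b_i}{\partial \tilde\theta_j} \right)^2 = \sum_{i,j} \left( \delta_{ij} + \frac{\partial b_i}{\partial \theta_j} \right)^2. \tag{107}$$

For terms with $i, j \ge 3$, we have $\tilde b_i = b_i$ and $\tilde\theta_j = \theta_j$, so that
replacing $b$ with $\tilde b$ and $\theta$ with $\tilde\theta$ does not change the result.
The terms with $i = 1, 2$ and $j \ge 3$ do not change because of
(103), while the terms with $i \ge 3$ and $j = 1, 2$ do not change
because of (104). It remains to show that the terms $i, j = 1, 2$
do not modify the sum. To this end, we write out these four
terms as

$$\left( 1 + \frac{\partial \tilde b_1}{\partial \tilde\theta_1} \right)^2 + \left( \frac{\partial \tilde b_1}{\partial \tilde\theta_2} \right)^2 + \left( \frac{\partial \tilde b_2}{\partial \tilde\theta_1} \right)^2 + \left( 1 + \frac{\partial \tilde b_2}{\partial \tilde\theta_2} \right)^2$$
$$= 2 + 2 \frac{\partial \tilde b_1}{\partial \tilde\theta_1} + 2 \frac{\partial \tilde b_2}{\partial \tilde\theta_2} + \sum_{i,j \le 2} \left( \frac{\partial \tilde b_i}{\partial \tilde\theta_j} \right)^2$$
$$= 2 + 2 \frac{\partial b_1}{\partial \theta_1} + 2 \frac{\partial b_2}{\partial \theta_2} + \sum_{i,j \le 2} \left( \frac{\partial b_i}{\partial \theta_j} \right)^2$$
$$= \left( 1 + \frac{\partial b_1}{\partial \theta_1} \right)^2 + \left( \frac{\partial b_1}{\partial \theta_2} \right)^2 + \left( \frac{\partial b_2}{\partial \theta_1} \right)^2 + \left( 1 + \frac{\partial b_2}{\partial \theta_2} \right)^2 \tag{108}$$

where, in the second transition, we have used (103), (104),
and (106). It follows that $I_1[\tilde b]$ of (102) is equal to $I_1[b]$, and
hence $Z[\tilde b] = Z[b]$. The result similarly holds for rotations
about any other two coordinates. Since any rotation can be
decomposed into a sequence of two-coordinate rotations, we
conclude that $Z[b]$ is rotation invariant.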

Next, we prove that $Z[b]$ is invariant to reflections through
hyperplanes containing the origin. Since $Z[b]$ is invariant to
rotations, it suffices to choose a single hyperplane, say
$\{\theta : \theta_1 = 0\}$. Let

$$\tilde b(\theta) \triangleq (-b_1(\theta), b_2(\theta), \ldots, b_n(\theta))^T \tag{109}$$

be the reflection of $b$, and consider the corresponding change
of variables

$$\tilde\theta \triangleq (-\theta_1, \theta_2, \ldots, \theta_n)^T. \tag{110}$$

By the symmetry assumptions, $p_\theta$ and $J$ are unaffected by the
change of variables; furthermore, $\partial \tilde b / \partial \tilde\theta = \partial b / \partial \theta$. It follows
that $\mathrm{CRB}[\tilde b, \tilde\theta] = \mathrm{CRB}[b, \theta]$, and therefore $Z[\tilde b] = Z[b]$. ■
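Lemma 4: Suppose $b(\theta)$ is radial and rotation invariant,
i.e., $b(\theta) = t(\|\theta\|^2)\,\theta$ for some function $t \in H^1$. Also
suppose that $J(\theta) = J(\|\theta\|) I$, where $J(\cdot)$ is a scalar function.
Then, $\mathrm{CRB}[b, \theta]$ is rotation invariant in $\theta$, i.e.,
$\mathrm{CRB}[b, R\theta] = \mathrm{CRB}[b, \theta]$ for any rotation matrix $R$.

Proof: We will show that $\mathrm{CRB}[b, \theta]$ depends on $\theta$ only
through $\|\theta\|^2$, and is therefore rotation invariant. For the given
value of $b(\theta)$ and $J(\theta)$, we have

$$\mathrm{CRB}[b, \theta] = \|b(\theta)\|^2 + \mathrm{Tr}\left[ \left(I + \frac{\partial b}{\partial \theta}\right) J^{-1}(\theta) \left(I + \frac{\partial b}{\partial \theta}\right)^T \right] = t^2 \|\theta\|^2 + \frac{1}{J(\|\theta\|)}\, \mathrm{Tr}\left[ \left(I + \frac{\partial (t\theta)}{\partial \theta}\right) \left(I + \frac{\partial (t\theta)}{\partial \theta}\right)^T \right] \tag{111}$$

where, for notational convenience, we have omitted the
dependence of $t$ on $\|\theta\|^2$. It remains to show that the trace in
the above expression is a function of $\theta$ only through $\|\theta\|^2$. To
this end, we note that

$$\frac{\partial b_i}{\partial \theta_j} = t\, \delta_{ij} + t'\, \theta_i \frac{\partial \|\theta\|^2}{\partial \theta_j} = t\, \delta_{ij} + 2 t'\, \theta_i \theta_j \tag{112}$$

where $\delta_{ij}$ is the Kronecker delta. It follows that

$$\left( \delta_{ij} + \frac{\partial b_i}{\partial \theta_j} \right)^2 = (1 + t)^2 \delta_{ij} + 4 (1 + t)\, t'\, \theta_i \theta_j\, \delta_{ij} + 4 t'^2 \theta_i^2 \theta_j^2. \tag{113}$$

Therefore

$$\mathrm{Tr}\left[ \left(I + \frac{\partial b}{\partial \theta}\right) \left(I + \frac{\partial b}{\partial \theta}\right)^T \right] = \sum_{i,j} \left( \delta_{ij} + \frac{\partial b_i}{\partial \theta_j} \right)^2 = n (1 + t)^2 + 4 t'^2 \sum_{i,j} \theta_i^2 \theta_j^2 + 4 (1 + t)\, t' \sum_i \theta_i^2 = n (1 + t)^2 + 4 t'^2 \|\theta\|^4 + 4 (1 + t)\, t'\, \|\theta\|^2. \tag{114}$$

Thus, $\mathrm{CRB}[b, \theta]$ depends on $\theta$ only through $\|\theta\|^2$, completing
the proof. ■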
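The algebra in (112)-(114) can be spot-checked numerically. The sketch below builds the Jacobian entries $t\,\delta_{ij} + 2t'\theta_i\theta_j$ directly, sums the squared entries of $I + \partial b/\partial\theta$, and compares the result with $n(1+t)^2 + 4t'^2\|\theta\|^4 + 4(1+t)t'\|\theta\|^2$. The particular profile $t(s) = -\tfrac{1}{2}e^{-s}$ used in the usage example is an arbitrary choice of ours, and the function names are illustrative.

```python
import math

def trace_direct(theta, t, dt):
    """Sum of squared entries of I + db/dtheta for b(theta) = t(||theta||^2) theta,
    using the Jacobian db_i/dtheta_j = t delta_ij + 2 t' theta_i theta_j."""
    s = sum(x * x for x in theta)
    tv, dtv = t(s), dt(s)
    total = 0.0
    for i, ti in enumerate(theta):
        for j, tj in enumerate(theta):
            delta = 1.0 if i == j else 0.0
            entry = delta + tv * delta + 2.0 * dtv * ti * tj
            total += entry * entry
    return total

def trace_closed(theta, t, dt):
    """Closed form n(1+t)^2 + 4 t'^2 ||theta||^4 + 4 (1+t) t' ||theta||^2."""
    n = len(theta)
    s = sum(x * x for x in theta)
    tv, dtv = t(s), dt(s)
    return n * (1.0 + tv) ** 2 + 4.0 * dtv ** 2 * s ** 2 + 4.0 * (1.0 + tv) * dtv * s

if __name__ == "__main__":
    profile = lambda s: -0.5 * math.exp(-s)        # hypothetical t(s)
    profile_prime = lambda s: 0.5 * math.exp(-s)   # its derivative t'(s)
    point = [0.3, -1.2, 0.7]
    print(trace_direct(point, profile, profile_prime),
          trace_closed(point, profile, profile_prime))
```

Since the closed form depends on $\theta$ only through $\|\theta\|^2$, evaluating `trace_direct` at two points of equal norm should give the same value.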

Proof: [Proof of Theorem 3] We have seen in Theorem 2
that the solution of (20) is unique. Now suppose that the
optimum $b$ is not rotation invariant, i.e., there exists a rotation
matrix $R$ such that $R\,b(\theta)$ is not identical to $b(\theta)$. By
Lemma 3, $R\,b(\theta)$ is also optimal, which is a contradiction.

Furthermore, suppose that $b$ is not radial, i.e., for some value
of $\theta$, $b(\theta)$ contains a component perpendicular to the vector
$\theta$. Consider a hyperplane passing through the origin, whose
normal is the aforementioned perpendicular component. By
Lemma 3, the reflection of $b$ through this hyperplane is also
an optimal solution of (20), which is again a contradiction.
Therefore, the optimum $b$ is spherically symmetric and radial,
so that it can be written as

$$b(\theta) = b(\|\theta\|)\, \frac{\theta}{\|\theta\|} \tag{115}$$

where $b(\cdot)$ is a scalar function.

To determine the value of &(■), it suffices to analyze the 
differential equation (f2Tb along a straight line from the origin 
to the boundary. We choose a line along the 0i axis, and begin 
by calculating the derivatives of &i(0), g(||0||), and J(||0||) 
along this axis. The derivative of q(\\6 

dq 



o9j p 



where we have denoted p = 

differentiable and 

dp = 

Along the 0i axis, we have 0i = 

so that 

dq 

d9~ „ 

3 0—pe.i 

Similarly, since J{9) — J(p)I, 

89, 
so that along the 0i axis 

d{J- l ) ]k 



101 



) is given by 

(116) 
so that p is weakly 



J. 

P ' 

p while 2 



q'(p)fi. 



■ji- 



J'(p) e s 

J 2 (p) p 



89, 



9=pei 



J'(P) 
J 2 {P) 



jk 



s jk s 



J'l- 



(117) 

, = 0, 

(118) 

(119) 
(120) 
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From d 1 15b . we have 






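Thus, on the $\theta_1$ axis, we have

$$\left.\frac{\partial b_1}{\partial\theta_j}\right|_{\boldsymbol{\theta}=\rho\mathbf{e}_1} = b'(\rho)\,\delta_{j1}. \tag{122}$$

The second derivative of $b_1(\boldsymbol{\theta})$ can be shown to equal

$$\frac{\partial^2 b_1}{\partial\theta_j\,\partial\theta_k} = b''(\rho)\,\frac{\theta_1\theta_j\theta_k}{\rho^3} + \left(\frac{b'(\rho)}{\rho} - \frac{b(\rho)}{\rho^2}\right)\left(\frac{\theta_1}{\rho}\delta_{jk} + \frac{\theta_j}{\rho}\delta_{1k} + \frac{\theta_k}{\rho}\delta_{1j} - \frac{3\,\theta_1\theta_j\theta_k}{\rho^3}\right). \tag{123}$$

Therefore, on the $\theta_1$ axis

$$\begin{aligned}
\left.\frac{\partial^2 b_1}{\partial\theta_1^2}\right|_{\boldsymbol{\theta}=\rho\mathbf{e}_1} &= b''(\rho), \\
\left.\frac{\partial^2 b_1}{\partial\theta_j^2}\right|_{\boldsymbol{\theta}=\rho\mathbf{e}_1} &= \frac{b'(\rho)}{\rho} - \frac{b(\rho)}{\rho^2} \quad (j \neq 1), \\
\left.\frac{\partial^2 b_1}{\partial\theta_j\,\partial\theta_k}\right|_{\boldsymbol{\theta}=\rho\mathbf{e}_1} &= 0 \quad (j \neq k).
\end{aligned} \tag{124}$$

Substituting these derivatives into (21), we obtain

$$q(\rho)\,b(\rho) = \frac{q(\rho)}{J(\rho)}\left(b''(\rho) + (n-1)\frac{b'(\rho)}{\rho} - (n-1)\frac{b(\rho)}{\rho^2}\right) + \bigl(1 + b'(\rho)\bigr)\left(\frac{q'(\rho)}{J(\rho)} - \frac{q(\rho)J'(\rho)}{J^2(\rho)}\right) \tag{125}$$

which is equivalent to (25).

To obtain the boundary conditions, observe that Lemma 3 implies $\mathbf{b}(\mathbf{0}) = \mathbf{0}$, whence we conclude that $b(0) = 0$. Next, evaluate the boundary condition (22) at the boundary point $\boldsymbol{\theta} = r\mathbf{e}_1$, where the surface normal $\boldsymbol{\nu}(\boldsymbol{\theta})$ equals $\mathbf{e}_1$, so that

$$1 + \left.\frac{\partial b_1}{\partial\theta_1}\right|_{\boldsymbol{\theta}=r\mathbf{e}_1} = 1 + b'(r) = 0 \tag{126}$$

which is equivalent to the boundary condition $b'(r) = -1$.

To find the OBB (24), we must now calculate $Z[\mathbf{b}]$ for the obtained bias function (115). To this end, note that, by Lemma 4, $\mathrm{CRB}[\mathbf{b}, \boldsymbol{\theta}]$ is rotation invariant in $\boldsymbol{\theta}$ for the required $\mathbf{b}(\boldsymbol{\theta})$. Thus, the integrand $\mathrm{CRB}[\mathbf{b}, \boldsymbol{\theta}]\,q(\|\boldsymbol{\theta}\|)$ is constant on any $(n-1)$-sphere centered on the origin, so that

$$Z[\mathbf{b}] = \int_0^r \mathrm{CRB}[\mathbf{b}, \rho\mathbf{e}_1]\,q(\rho)\,S_n(\rho)\,d\rho \tag{127}$$

where

$$S_n(\rho) = \frac{2\pi^{n/2}\rho^{n-1}}{\Gamma(n/2)} \tag{128}$$

is the hypersurface area of an $(n-1)$-sphere of radius $\rho$ [35]. It thus suffices to calculate the value of $\mathrm{CRB}[\mathbf{b}, \boldsymbol{\theta}]$ at points along the $\theta_1$ axis. From (121), it follows that

$$\left.\frac{\partial\mathbf{b}}{\partial\boldsymbol{\theta}}\right|_{\boldsymbol{\theta}=\rho\mathbf{e}_1} = \mathrm{diag}\left(b'(\rho), \frac{b(\rho)}{\rho}, \ldots, \frac{b(\rho)}{\rho}\right). \tag{129}$$

Substituting this into the definition of $\mathrm{CRB}[\mathbf{b}, \boldsymbol{\theta}]$, we obtain

$$\mathrm{CRB}[\mathbf{b}, \rho\mathbf{e}_1] = b^2(\rho) + \frac{1}{J(\rho)}\bigl(1 + b'(\rho)\bigr)^2 + \frac{n-1}{J(\rho)}\left(1 + \frac{b(\rho)}{\rho}\right)^2. \tag{130}$$

Combining (130) with (127) yields (24), as required. ■

Appendix V
Proofs of Asymptotic Properties

Theorems 5 and 6 demonstrate asymptotic tightness of the OBB. The proofs of these two theorems follow.

Proof: [Proof of Theorem 5] We begin the proof by studying a certain optimization problem, whose relevance will be demonstrated shortly. Let $t > 0$ be a constant and consider the problem

$$u(t) = \inf_{\mathbf{b}\in H^1} \int_\Theta \left\|\mathbf{I} + \frac{\partial\mathbf{b}}{\partial\boldsymbol{\theta}}\right\|_F^2 p_\theta(d\boldsymbol{\theta}) \quad \text{s.t.} \quad \int_\Theta \|\mathbf{b}(\boldsymbol{\theta})\|^2\,p_\theta(d\boldsymbol{\theta}) \le t. \tag{131}$$

Notice that $u(t) \le n$ for all $t$, since an objective having a value of $n$ is achieved by the function $\mathbf{b}(\boldsymbol{\theta}) = \mathbf{0}$. Thus, it suffices to perform the minimization (131) over functions $\mathbf{b} \in H^1$ satisfying

$$\int_\Theta \left\|\mathbf{I} + \frac{\partial\mathbf{b}}{\partial\boldsymbol{\theta}}\right\|_F^2 p_\theta(d\boldsymbol{\theta}) \le n. \tag{132}$$

It follows from Lemma 2 that such functions also satisfy

$$\int_\Theta \left\|\frac{\partial\mathbf{b}}{\partial\boldsymbol{\theta}}\right\|_F^2 p_\theta(d\boldsymbol{\theta}) \le (2\sqrt{n})^2 = 4n. \tag{133}$$

Therefore, (131) is equivalent to the minimization

$$u(t) = \inf_{\mathbf{b}\in S_t} \int_\Theta \left\|\mathbf{I} + \frac{\partial\mathbf{b}}{\partial\boldsymbol{\theta}}\right\|_F^2 p_\theta(d\boldsymbol{\theta}) \tag{134}$$

where

$$S_t = \left\{ \mathbf{b} \in H^1 : \int_\Theta \|\mathbf{b}(\boldsymbol{\theta})\|^2\,p_\theta(d\boldsymbol{\theta}) \le t, \quad \int_\Theta \left\|\frac{\partial\mathbf{b}}{\partial\boldsymbol{\theta}}\right\|_F^2 p_\theta(d\boldsymbol{\theta}) \le 4n \right\}. \tag{135}$$

The set $S_t$ is convex, closed, and bounded in $H^1$. Applying Lemma 1 (with $\ell = 2$) implies that there exists a function $\mathbf{b}_{\mathrm{opt}} \in S_t$ which minimizes (134), and hence also minimizes (131).

Note that the objective in (131) is zero if and only if

$$\frac{\partial\mathbf{b}}{\partial\boldsymbol{\theta}} = -\mathbf{I} \quad \text{a.e. } (p_\theta). \tag{136}$$

The only functions in $H^1$ satisfying this requirement are the functions

$$\mathbf{b}(\boldsymbol{\theta}) = \mathbf{k} - \boldsymbol{\theta} \quad \text{a.e. } (p_\theta) \tag{137}$$

for some constant $\mathbf{k} \in \mathbb{R}^n$. Let $\boldsymbol{\mu} = E\{\boldsymbol{\theta}\}$ and define

$$v \triangleq E\{\|\boldsymbol{\theta} - E\{\boldsymbol{\theta}\}\|^2\}. \tag{138}$$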
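For bias functions of the form (137), the mean squared bias is $E\{\|\mathbf{k} - \boldsymbol{\theta}\|^2\}$, which splits into $\|\mathbf{k} - \boldsymbol{\mu}\|^2$ plus the prior variance $v$ of (138), and is therefore never smaller than $v$. The sketch below checks this split on a small discrete prior. The support points, weights, and the vector `k` are hypothetical values chosen only for illustration.

```python
def mean_sq_bias(k, points, weights):
    """E||k - theta||^2 under a discrete prior given by support points and weights."""
    return sum(
        w * sum((ki - ti) ** 2 for ki, ti in zip(k, p))
        for p, w in zip(points, weights)
    )

# Hypothetical discrete prior on R^2 (weights sum to one).
points = [(0.0, 1.0), (2.0, -1.0), (1.0, 3.0)]
weights = [0.5, 0.3, 0.2]

# Prior mean mu and prior variance v = E||theta - mu||^2, as in (138).
mu = tuple(sum(w * p[i] for p, w in zip(points, weights)) for i in range(2))
v = mean_sq_bias(mu, points, weights)

# Any other constant k gives a strictly larger value: ||k - mu||^2 + v.
k = (0.4, -0.7)
lhs = mean_sq_bias(k, points, weights)
rhs = sum((ki - mi) ** 2 for ki, mi in zip(k, mu)) + v
print(lhs, rhs, v)
```

The first two printed values should agree up to rounding, and both should exceed the third.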
For functions of the form (137), the constraint of (131) is given by

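$$\int_\Theta \|\mathbf{k} - \boldsymbol{\theta}\|^2 p_\theta(d\boldsymbol{\theta}) = \int_\Theta \|\mathbf{k} - \boldsymbol{\mu} + \boldsymbol{\mu} - \boldsymbol{\theta}\|^2 p_\theta(d\boldsymbol{\theta}) = \|\mathbf{k} - \boldsymbol{\mu}\|^2 + \int_\Theta \|\boldsymbol{\mu} - \boldsymbol{\theta}\|^2 p_\theta(d\boldsymbol{\theta}) \ge v. \tag{139}$$

In (139), equality is obtained if and only if $\mathbf{k} = \boldsymbol{\mu}$. Therefore, if $t < v$, no functions satisfying (136) are feasible, and thus

$$\begin{aligned} u(t) &= 0 \quad \text{if } t \ge v, \\ u(t) &> 0 \quad \text{if } t < v. \end{aligned} \tag{140}$$

We now return to the setting of Theorem 5. We must show that $\beta_N \to v$ as $N \to \infty$. We denote functions corresponding to the problem of estimating $\boldsymbol{\theta}$ from $\mathbf{x}^{(N)}$ with a superscript $(N)$. Thus, for example, $Z^{(N)}[\mathbf{b}]$ denotes the functional $Z[\mathbf{b}]$ of (12) for the problem corresponding to the measurement vector $\mathbf{x}^{(N)}$.

Since all eigenvalues of $\mathbf{J}^{(N)}(\boldsymbol{\theta})$ decrease monotonically with $N$ for $p_\theta$-almost all $\boldsymbol{\theta}$, we have

$$\mathrm{CRB}^{(N)}[\mathbf{b}, \boldsymbol{\theta}] \le \mathrm{CRB}^{(N+1)}[\mathbf{b}, \boldsymbol{\theta}] \tag{141}$$

for any $\mathbf{b} \in H^1$, for $p_\theta$-almost all $\boldsymbol{\theta}$, and for all $N$. Therefore

$$Z^{(N)}[\mathbf{b}] \le Z^{(N+1)}[\mathbf{b}] \tag{142}$$

for any $\mathbf{b} \in H^1$ and for all $N$. It follows that for all $N$

$$\beta_N = \min_{\mathbf{b}\in H^1} Z^{(N)}[\mathbf{b}] \le \min_{\mathbf{b}\in H^1} Z^{(N+1)}[\mathbf{b}] = \beta_{N+1} \tag{143}$$

so that $\beta_N$ is a non-decreasing sequence. Furthermore, note that

$$Z^{(N)}[\boldsymbol{\mu} - \boldsymbol{\theta}] = v \quad \text{for all } N \tag{144}$$

where $v$ is given by (138). Therefore, $\beta_N \le v$ for all $N$. Thus $\beta_N$ converges to some value $q$, and we have

$$\beta_N \le q \le v \quad \text{for all } N. \tag{145}$$

To prove the theorem, it remains to show that $q = v$.

Let $\mathbf{b}^{(N)}$ be the minimizer of (17) when $\boldsymbol{\theta}$ is estimated from $\mathbf{x}^{(N)}$; this minimizer exists by virtue of Proposition 1. We then have

$$\beta_N = Z^{(N)}[\mathbf{b}^{(N)}] \le q \tag{146}$$

and therefore

$$\int_\Theta \|\mathbf{b}^{(N)}(\boldsymbol{\theta})\|^2 p_\theta(d\boldsymbol{\theta}) \le q. \tag{147}$$

It follows that $\mathbf{b}^{(N)}$ satisfies the constraint of the optimization problem (131) with $t = q$. As a consequence, we have

$$\int_\Theta \left\|\mathbf{I} + \frac{\partial\mathbf{b}^{(N)}}{\partial\boldsymbol{\theta}}\right\|_F^2 p_\theta(d\boldsymbol{\theta}) \ge u(q). \tag{148}$$

Define

$$\lambda_N = \operatorname*{ess\,sup}_{\boldsymbol{\theta}\in\Theta} \lambda_{\max}\bigl(\mathbf{J}^{(N)}(\boldsymbol{\theta})\bigr) \tag{149}$$

and note that $\lambda_N > 0$ for all $N$, since $\mathbf{J}^{(N)}(\boldsymbol{\theta})$ is positive definite. Thus

$$\begin{aligned}
\beta_N &\ge \int_\Theta \mathrm{Tr}\left(\left(\mathbf{I} + \frac{\partial\mathbf{b}^{(N)}}{\partial\boldsymbol{\theta}}\right)\bigl(\mathbf{J}^{(N)}\bigr)^{-1}\left(\mathbf{I} + \frac{\partial\mathbf{b}^{(N)}}{\partial\boldsymbol{\theta}}\right)^{T}\right) p_\theta(d\boldsymbol{\theta}) \\
&\ge \frac{1}{\lambda_N}\int_\Theta \left\|\mathbf{I} + \frac{\partial\mathbf{b}^{(N)}}{\partial\boldsymbol{\theta}}\right\|_F^2 p_\theta(d\boldsymbol{\theta}) \\
&\ge \frac{u(q)}{\lambda_N}.
\end{aligned} \tag{150}$$

Assume by contradiction that $q < v$. From (140), it then follows that $u(q) > 0$. Since all eigenvalues of $\mathbf{J}^{(N)}(\boldsymbol{\theta})$ decrease to zero, we have $\lambda_N \to 0$, and thus

$$\beta_N \ge \frac{u(q)}{\lambda_N} \to \infty. \tag{151}$$

This contradicts the fact (145) that $\beta_N \le v$. We conclude that $q = v$, as required. ■

Proof: [Proof of Theorem 6] The proof is analogous to that of Theorem 5. We begin by considering the optimization problem

$$\inf_{\mathbf{b}\in H^1} \int_\Theta \|\mathbf{b}(\boldsymbol{\theta})\|^2 p_\theta(d\boldsymbol{\theta}) \quad \text{s.t.} \quad \int_\Theta \mathrm{Tr}\left(\left(\mathbf{I} + \frac{\partial\mathbf{b}}{\partial\boldsymbol{\theta}}\right)\mathbf{J}^{-1}(\boldsymbol{\theta})\left(\mathbf{I} + \frac{\partial\mathbf{b}}{\partial\boldsymbol{\theta}}\right)^{T}\right) p_\theta(d\boldsymbol{\theta}) \le t \tag{152}$$

for some constant $t > 0$. Denote the minimum value of (152) by $w(t)$. Let $\boldsymbol{\mu} = E\{\boldsymbol{\theta}\}$ and note that $\mathbf{b}(\boldsymbol{\theta}) = \boldsymbol{\mu} - \boldsymbol{\theta}$ satisfies the constraint in (152) for any $t > 0$, and has an objective equal to $v$ of (138). Thus, to determine $w(t)$, it suffices to minimize (152) over the set

$$S_t = \left\{ \mathbf{b} \in H^1 : \int_\Theta \|\mathbf{b}(\boldsymbol{\theta})\|^2 p_\theta(d\boldsymbol{\theta}) \le v, \quad \int_\Theta \mathrm{Tr}\left(\left(\mathbf{I} + \frac{\partial\mathbf{b}}{\partial\boldsymbol{\theta}}\right)\mathbf{J}^{-1}\left(\mathbf{I} + \frac{\partial\mathbf{b}}{\partial\boldsymbol{\theta}}\right)^{T}\right) p_\theta(d\boldsymbol{\theta}) \le t \right\}.$$

Define

$$\lambda = \operatorname*{ess\,sup}_{\boldsymbol{\theta}\in\Theta} \lambda_{\max}\bigl(\mathbf{J}(\boldsymbol{\theta})\bigr). \tag{153}$$

Since $\mathbf{J}(\boldsymbol{\theta})$ is positive definite almost everywhere, we have $\lambda > 0$. For any $\mathbf{b} \in S_t$, we have

$$\frac{1}{\lambda}\int_\Theta \left\|\mathbf{I} + \frac{\partial\mathbf{b}}{\partial\boldsymbol{\theta}}\right\|_F^2 p_\theta(d\boldsymbol{\theta}) \le t \tag{154}$$

and therefore, by Lemma 2

$$\int_\Theta \left\|\frac{\partial\mathbf{b}}{\partial\boldsymbol{\theta}}\right\|_F^2 p_\theta(d\boldsymbol{\theta}) \le \bigl(\sqrt{t\lambda} + \sqrt{n}\bigr)^2. \tag{155}$$

Hence, for any $\mathbf{b} \in S_t$,

$$\|\mathbf{b}\|_{H^1}^2 = \int_\Theta \|\mathbf{b}(\boldsymbol{\theta})\|^2 p_\theta(d\boldsymbol{\theta}) + \int_\Theta \left\|\frac{\partial\mathbf{b}}{\partial\boldsymbol{\theta}}\right\|_F^2 p_\theta(d\boldsymbol{\theta}) \le v + \bigl(\sqrt{t\lambda} + \sqrt{n}\bigr)^2. \tag{156}$$

Thus $S_t$ is bounded for all $t$. It is straightforward to show that $S_t$ is also closed and convex. Therefore, employing Lemma 1 (with $\ell = 1$) ensures that there exists a (unique) $\mathbf{b}_{\mathrm{opt}} \in S_t$ minimizing (152).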

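Note that the objective in (152) is 0 if and only if $\mathbf{b}_{\mathrm{opt}}(\boldsymbol{\theta}) = \mathbf{0}$ almost everywhere. So, if $\mathbf{0} \in S_t$, we have $w(t) = 0$, and otherwise $w(t) > 0$. Let us define

$$s \triangleq E\{\mathrm{Tr}(\mathbf{J}^{-1}(\boldsymbol{\theta}))\} \tag{157}$$

and note that $\mathbf{0} \in S_t$ if and only if $t \ge s$. Thus

$$\begin{aligned} w(t) &= 0 \quad \text{for } t \ge s, \\ w(t) &> 0 \quad \text{otherwise.} \end{aligned} \tag{158}$$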



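Let us now return to the setting of Theorem 6. For simplicity, we denote functions corresponding to the problem of estimating $\boldsymbol{\theta}$ from $\{\mathbf{x}^{(1)}, \ldots, \mathbf{x}^{(N)}\}$ with a superscript $(N)$. For example, from the additive property of the Fisher information [2, §3.4], we have

$$\mathbf{J}^{(N)}(\boldsymbol{\theta}) = N\mathbf{J}(\boldsymbol{\theta}). \tag{159}$$

It follows that

$$(N+1)\,\mathrm{CRB}^{(N+1)}[\mathbf{b}, \boldsymbol{\theta}] \ge N\,\mathrm{CRB}^{(N)}[\mathbf{b}, \boldsymbol{\theta}] \tag{160}$$

for all $\mathbf{b} \in H^1$, all $\boldsymbol{\theta} \in \Theta$, and all $N$. Therefore

$$(N+1)\,Z^{(N+1)}[\mathbf{b}] \ge N\,Z^{(N)}[\mathbf{b}] \tag{161}$$

for all $\mathbf{b} \in H^1$, and hence

$$(N+1)\,\beta_{N+1} = \min_{\mathbf{b}\in H^1} (N+1)\,Z^{(N+1)}[\mathbf{b}] \ge \min_{\mathbf{b}\in H^1} N\,Z^{(N)}[\mathbf{b}] = N\beta_N. \tag{162}$$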
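The monotonicity in (160) can be seen directly in the scalar case, where $\mathbf{J}^{(N)} = N\mathbf{J}$ gives $N\,\mathrm{CRB}^{(N)}[b, \theta] = N b^2(\theta) + (1 + b'(\theta))^2 / J(\theta)$: only the squared-bias term grows with $N$. The sketch below evaluates this quantity for increasing $N$. The bias function, its derivative, and the Fisher information profile are hypothetical choices used only to illustrate the inequality.

```python
def scaled_crb(n_obs, bias, dbias, fisher, theta):
    """Return N * CRB^(N)[b, theta] for a scalar parameter, using J^(N) = N * J."""
    crb = bias(theta) ** 2 + (1.0 + dbias(theta)) ** 2 / (n_obs * fisher(theta))
    return n_obs * crb

# Hypothetical scalar bias function, its derivative, and Fisher information.
def bias(theta):
    return -0.2 * theta

def dbias(theta):
    return -0.2

def fisher(theta):
    return 1.0 + theta * theta

theta = 0.7
values = [scaled_crb(n, bias, dbias, fisher, theta) for n in range(1, 51)]

# Successive differences equal b(theta)^2 >= 0, so the sequence is non-decreasing.
increments = [b - a for a, b in zip(values, values[1:])]
print(values[0], values[-1], min(increments))
```

Integrating over the prior preserves the inequality, which is the step from (160) to (161).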



Thus $\{N\beta_N\}$ is a non-decreasing sequence. Furthermore, we have

$$N\,Z^{(N)}[\mathbf{0}] = s \tag{163}$$

so that $N\beta_N \le s$ for all $N$. It follows that $\{N\beta_N\}$ is non-decreasing and bounded, and therefore converges to some value $r$ such that

$$N\beta_N \le r \le s \quad \text{for all } N. \tag{164}$$



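To prove the theorem, we must show that $r = s$.

Let $\mathbf{b}^{(N)} \in H^1$ denote the minimizer of (17) when $\boldsymbol{\theta}$ is estimated from $\{\mathbf{x}^{(1)}, \ldots, \mathbf{x}^{(N)}\}$ (the existence of $\mathbf{b}^{(N)}$ is guaranteed by Proposition 1). We then have $N\beta_N = N\,Z^{(N)}[\mathbf{b}^{(N)}] \le r$, so that

$$\int_\Theta \mathrm{Tr}\left(\left(\mathbf{I} + \frac{\partial\mathbf{b}^{(N)}}{\partial\boldsymbol{\theta}}\right)\mathbf{J}^{-1}\left(\mathbf{I} + \frac{\partial\mathbf{b}^{(N)}}{\partial\boldsymbol{\theta}}\right)^{T}\right) p_\theta(d\boldsymbol{\theta}) \le r. \tag{165}$$

Thus, $\mathbf{b}^{(N)}$ satisfies the constraint of (152) with $t = r$. As a consequence, we have

$$\int_\Theta \|\mathbf{b}^{(N)}(\boldsymbol{\theta})\|^2 p_\theta(d\boldsymbol{\theta}) \ge w(r) \tag{166}$$

and therefore

$$N\beta_N = N\,Z^{(N)}[\mathbf{b}^{(N)}] \ge N\int_\Theta \|\mathbf{b}^{(N)}(\boldsymbol{\theta})\|^2 p_\theta(d\boldsymbol{\theta}) \ge N\,w(r). \tag{167}$$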

Now suppose by contradiction that $r < s$. It follows from (158) that $w(r) > 0$. Hence, by (167), $N\beta_N \to \infty$, which contradicts the fact that $N\beta_N$ is bounded. We conclude that $r = s$, as required. ■
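The limiting behaviour in the two proofs can be illustrated, though not proved, on a toy scalar problem. Assume a zero-mean prior with variance $\sigma^2$, a constant Fisher information $J$ per measurement, and restrict attention to linear bias functions $b(\theta) = -a\theta$. For this family, $Z^{(N)}[b] = a^2\sigma^2 + (1-a)^2/(NJ)$, and its minimum over $a$ is an upper bound on $\beta_N$. The sketch below minimizes over a grid of $a$ values and shows that $N$ times the restricted minimum is non-decreasing, stays below $s = 1/J$, and approaches $s$. All numerical values and function names are assumptions made for illustration.

```python
def restricted_bound(n_obs, fisher, prior_var, grid=20001):
    """Minimise a^2 * prior_var + (1 - a)^2 / (n_obs * fisher) over a in [0, 1].

    This is Z^(N)[b] restricted to linear bias functions b(theta) = -a * theta,
    so the result is an upper bound on the OBB, not the OBB itself.
    """
    best = float("inf")
    for idx in range(grid):
        a = idx / (grid - 1)
        value = a * a * prior_var + (1.0 - a) ** 2 / (n_obs * fisher)
        best = min(best, value)
    return best

# Hypothetical problem constants (illustration only).
fisher = 2.0        # Fisher information J of a single measurement
prior_var = 1.5     # prior variance sigma^2, playing the role of v in (138)
s = 1.0 / fisher    # E{Tr(J^{-1})} for a constant scalar J, as in (157)

counts = [1, 10, 100, 1000]
scaled = [n * restricted_bound(n, fisher, prior_var) for n in counts]

# Closed form of the restricted minimum, for comparison with the grid search.
closed = [n * prior_var / (1.0 + n * fisher * prior_var) for n in counts]
print(scaled, closed, s)
```

Letting `fisher` tend to zero with `n_obs` fixed drives the restricted minimum toward `prior_var`, which mirrors the uninformative-measurement regime of Theorem 5.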